Replace rows in one dataframe with rows from a different dataframe through a loop

Question

I am trying to replace rows in a dataframe with rows from another dataframe. I have an Excel file called 'MASTER.xlsx' with all the existing product codes in column 0; its remaining columns are empty. I have another Excel file called 'COUT PROJET - HOTEL DE VILLE.xlsx' containing some of the product codes in column 0, with the remaining columns filled with values.

Ultimately, I want to iterate through both the 'MASTER.xlsx' and 'COUT PROJET - HOTEL DE VILLE.xlsx' files. When the product code is in both files, I want to replace that respective row in 'MASTER.xlsx' with the filled out row from 'COUT PROJET - HOTEL DE VILLE.xlsx'. When the product code is not in 'COUT PROJET - HOTEL DE VILLE.xlsx', I want that row in 'MASTER.xlsx' to remain unchanged (empty).

import numpy as np
import pandas as pd
import time
import glob

df_master = pd.read_excel('MASTER.XLSX')

df = pd.read_excel('COÛT PROJET - HÔTEL DE VILLE.xlsx')

for index, column in df.iterrows():
    for index, row in df_master.iterrows():
        if row['DATE :'] == column['DATE :']:
            df_master.update(df)
        else:
            continue
        
df_master.to_excel('UPDATED COÛT PROJET - HÔTEL DE VILLE.xlsx')

The current code seems to partly work; I think the problem is that the dataframes don't have the same size. I have included pictures of what the Excel files look like. I apologize for my lack of knowledge; I am a beginner trying to help out the family business. Thank you for the help!


Please include sample input and expected output.

– sushanth, Commented Jul 11, 2020 at 15:54

jb4earth · Accepted Answer · 2020-07-11 16:16:00Z

0

You can do most things in pandas without loops.

Try something like this:

import pandas as pd

df1 = pd.DataFrame({'A': ['A0'],
                    'B': ['B0'],
                    'C': ['C0'],
                    'D': ['D0']})

df2 = pd.DataFrame({'A': ['A0','','','',''],
                    'B': ['B1','B2', 'B3', 'B4', 'B5'],
                    'C': ['C0','','','',''],
                    'D': ['D1','D2', 'D3', 'D4', 'D5']})

pd.concat([df1, df2], axis=0, sort=False).T
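For the specific case in the question, one loop-free option is to index both frames by product code and call `DataFrame.update`, which aligns on index and column labels rather than on row position. The following is only a minimal sketch under assumptions: the small frames stand in for the two Excel files, `CODE`, `QTY` and `PRICE` are hypothetical column names (the question's code uses `'DATE :'` as the key column), and product codes are assumed to be unique in both files.

```python
import pandas as pd

# Hypothetical stand-ins for MASTER.xlsx and the project file.
# 'CODE', 'QTY' and 'PRICE' are assumed column names, not from the question.
df_master = pd.DataFrame({
    'CODE': ['P1', 'P2', 'P3', 'P4'],
    'QTY': [None, None, None, None],
    'PRICE': [None, None, None, None],
})
df_project = pd.DataFrame({
    'CODE': ['P3', 'P1'],
    'QTY': [5, 2],
    'PRICE': [9.5, 3.0],
})

# Index both frames by product code so update() matches rows on the code
# instead of on the default 0..n-1 row numbers.
master = df_master.set_index('CODE')
project = df_project.set_index('CODE')

# Overwrite master cells with the non-missing values from project, in place.
# Codes that are absent from project keep their empty rows.
master.update(project)

result = master.reset_index()
print(result)
```

With the real files, the two frames would come from `pd.read_excel` and the result would be written back with `to_excel`, as in the question. This also suggests why the original loop only partly works: without a shared index, `update` pairs rows by their position in each file, not by product code.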


Stephen Strosko · Accepted Answer · 2020-07-11 16:16:02Z

0

Typically you want to avoid explicit loops when using pandas; they are much slower and less efficient than the built-in operations. The best method is to use the apply feature in pandas (documentation here). Here are a few examples of how to use apply: example 1, example 2, example 3.
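As a rough illustration of what `apply` looks like in place of a loop, here is a minimal sketch. The data and the column names `QTY`, `PRICE` and `TOTAL` are hypothetical and chosen only for the example; they do not come from the question's files.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame({'QTY': [2, 5, 1], 'PRICE': [3.0, 9.5, 4.0]})

def line_total(row):
    # With axis=1, apply passes each row to this function as a Series.
    return row['QTY'] * row['PRICE']

# Call line_total once per row and store the results in a new column.
df['TOTAL'] = df.apply(line_total, axis=1)
print(df)
```

For simple arithmetic like this, the column expression `df['QTY'] * df['PRICE']` gives the same result without calling a Python function per row; `apply` is mainly useful when the per-row logic does not fit a column-wise expression.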
