Merge multiple CSV files using Python

Question

I would like to merge three CSV files, as follows. First CSV file:

df = pd.DataFrame()
df["train_board_station"] = ['Tokyo','LA','Paris','New_York','Delhi']
df["train_off_station"] = ['Phoenix','London','Sydney','Berlin','Shanghai']

Second csv file:

ref = pd.DataFrame() 
ref["station"] = ['Tokyo','London','Paris','New_York','Shanghai','LA','Sydney','Berlin','Phoenix','Delhi','Tokyo','London','Paris','Sydney','Berlin']
ref["point_A"] = ['-34.54','56.789','-78,98','45.62','111.67','23.78','-98.40','-76.89','23.98','23.89']
ref["point_B"] = ['34.89','-78.55','78.89','34.12','56.56','23.23','-78.65','34.76','23.67','21.645']

Third csv file:

rec = pd.DataFrame()
rec["code"] = ['Tokyo','London','Paris','New_York','Shanghai','LA','Sydney','Berlin','Phoenix','Delhi']
rec["count_A"] = ['1.2','7.8','4','8','7.8','3','8','5','2','10']
rec["count_B"] = ['12','78','4','8','78','36','88','51','25','10']

I tried this, but I get a memory error:

for x in ["board", "off"]:
    df["station"] = df["train_" + x + "_station"]
    df["code"] = df["train_" + x + "_station"]
    df = pd.concat([df, ref,rec], axis=1, join_axes=[df.index])
    df[x + "_point_A"] = df["point_A"]
    df[x + "_point_B"] = df["point_B"]
    df[x + "_count_A"] = df["count_A"]
    df[x + "_count_B"] = df["count_B"]
    df = df.drop(["station", "point_A","point_B","code","count_A","count_B"], axis=1)

I get the memory error.

Because your for loop is trying to access df["train_" + x + "_station"] with x = board which is invalid.
– ZdaR, Commented May 3, 2017 at 8:55

jezrael · Accepted Answer · 2017-05-03 08:58:35Z

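It seems you need separate df1 and df2 variables in the loops (latitude, longitude, freq, count and por below appear to correspond to point_A, point_B, count_A, count_B and rec in the question):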

for x in ["board", "off"]:
    df["station"] = df["train_" + x + "_station"]
    df1 = pd.concat([df, ref], axis=1, join_axes=[df.index])
    df[x + "_latitude"] = df1["latitude"]
    df[x + "_longitude"] = df1["longitude"]
    df = df.drop("station", axis=1)

for x in ["board", "off"]:
    df["code"] = df["train_" + x + "_station"]
    df2 = pd.concat([df, por], axis=1, join_axes=[df.index])
    df[x + "_freq"] = df2["freq"]
    df[x + "_count"] = df2["count"]
    df = df.drop(["code"], axis=1)

print (df)
  train_board_station train_off_station board_latitude board_longitude  \
0               Tokyo           Phoenix         -34.54           34.89   
1                  LA            London         56.789          -78.55   
2               Paris            Sydney         -78,98           78.89   
3            New_York            Berlin          45.62           34.12   
4               Delhi          Shanghai         111.67           56.56   

  off_latitude off_longitude board_freq board_count off_freq off_count  
0       -34.54         34.89        1.2          12      1.2        12  
1       56.789        -78.55        7.8          78      7.8        78  
2       -78,98         78.89          4           4        4         4  
3        45.62         34.12          8           8        8         8  
4       111.67         56.56        7.8          78      7.8        78
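Two caveats about the approach above may be worth spelling out. First, pd.concat with axis=1 aligns rows by index position, not by station name, which is why the off_ columns in the output repeat the board_ values. Second, the join_axes argument was removed from pd.concat in pandas 1.0, so the calls above fail on current versions. Below is a minimal sketch of a key-based alternative using DataFrame.merge. It assumes each station appears only once in ref and rec (the station list is trimmed to ten entries so it matches the ten point values), and the names lookup, renamed and out are illustrative, not from the original post.

```python
import pandas as pd

# Trips table from the question.
df = pd.DataFrame({
    "train_board_station": ["Tokyo", "LA", "Paris", "New_York", "Delhi"],
    "train_off_station": ["Phoenix", "London", "Sydney", "Berlin", "Shanghai"],
})

stations = ["Tokyo", "London", "Paris", "New_York", "Shanghai",
            "LA", "Sydney", "Berlin", "Phoenix", "Delhi"]

# Lookup tables with the question's values; assumption: one row per station.
ref = pd.DataFrame({
    "station": stations,
    "point_A": ["-34.54", "56.789", "-78,98", "45.62", "111.67",
                "23.78", "-98.40", "-76.89", "23.98", "23.89"],
    "point_B": ["34.89", "-78.55", "78.89", "34.12", "56.56",
                "23.23", "-78.65", "34.76", "23.67", "21.645"],
})
rec = pd.DataFrame({
    "code": stations,
    "count_A": ["1.2", "7.8", "4", "8", "7.8", "3", "8", "5", "2", "10"],
    "count_B": ["12", "78", "4", "8", "78", "36", "88", "51", "25", "10"],
})

# Combine both lookup tables into one, keyed by station name.
lookup = ref.merge(rec, left_on="station", right_on="code").drop(columns="code")

out = df.copy()
for x in ["board", "off"]:
    key = "train_" + x + "_station"
    # Prefix the value columns, e.g. point_A -> board_point_A.
    renamed = lookup.rename(columns=lambda c: c if c == "station" else x + "_" + c)
    # Left merge keeps every trip row and matches on the station name.
    out = out.merge(renamed, how="left", left_on=key, right_on="station")
    out = out.drop(columns="station")

print(out)
```

With how="left", every row of df is kept, and a station missing from the lookup table would yield NaN in the added columns rather than dropping the trip.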

I have a small problem with this! The files don't get merged if there are two or more lines with the same code name. Only the first line with that code gets merged; if a second line has the same code name, it doesn't get merged. Can you please help me solve this problem?
It is better to create a new question, but I am working on a solution.
I have another error also. Please check: stackoverflow.com/questions/43977906/…
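
On the duplicate-code issue raised in the comments: a key-based DataFrame.merge returns one output row per matching lookup row, so repeated codes are not silently dropped. The sketch below is only an illustration under assumed data. The tiny trips table and the second "Tokyo" row with count_A "9.9" are hypothetical and not from the question.

```python
import pandas as pd

# Hypothetical two-row trips table.
trips = pd.DataFrame({"train_board_station": ["Tokyo", "Paris"]})

# Hypothetical lookup table in which the code "Tokyo" appears twice.
rec = pd.DataFrame({
    "code": ["Tokyo", "Paris", "Tokyo"],
    "count_A": ["1.2", "4", "9.9"],
})

# Left merge: each trip row is repeated once per matching lookup row.
all_matches = trips.merge(
    rec, how="left", left_on="train_board_station", right_on="code"
)

# To keep only the first row per code, drop duplicates before merging.
# validate="many_to_one" raises MergeError if the right-hand keys are not unique.
first_only = trips.merge(
    rec.drop_duplicates("code"),
    how="left",
    left_on="train_board_station",
    right_on="code",
    validate="many_to_one",
)

print(all_matches)
print(first_only)
```

Which behavior is wanted (all matching rows, or only the first per code) depends on what the duplicate lines mean in the real files.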
