I pass a single dataframe through several data-cleansing steps. One of those steps I am unable to complete without creating another dataframe.

import pandas as pd

data = {'ID': [1, 2], '2020-11-01': [10, 15], '2020-11-02': [43, 35]}
df1 = pd.DataFrame.from_dict(data)
df1.head()


    ID  2020-11-01  2020-11-02
0   1   10  43
1   2   15  35

I need to convert those date columns into rows, so I used melt:

df2 = df1.melt(id_vars=["ID"], var_name="ReportDate", value_name="Units")
df2.head()

    ID  ReportDate  Units
0   1   2020-11-01  10
1   2   2020-11-01  15
2   1   2020-11-02  43
3   2   2020-11-02  35

Now I need to discard everything in df1 and copy the contents of df2 into it.

I tried dropping all columns from df1 (using inplace=True) and then assigning:

df1["ID"] = df2["ID"]
df1["ReportDate"] = df2["ReportDate"]
df1["Units"] = df2["Units"]
df1.head()

    ID  ReportDate  Units
0   1   2020-11-01  10
1   2   2020-11-01  15

But I ended up with only 2 rows, since df1 previously had shape 2x3.

I need my output to look like this:

df1.head()

    ID  ReportDate  Units
0   1   2020-11-01  10
1   2   2020-11-01  15
2   1   2020-11-02  43
3   2   2020-11-02  35

How do I get df1 to have all the contents of df2?

  • What should your final df1 look like? Please show it as if you had called df1.head(), as you did for the earlier stages. Commented Nov 5, 2020 at 19:33
  • Why does df1 = df2 not satisfy your requirements? Or: import copy; df1 = copy.deepcopy(df2) Commented Nov 5, 2020 at 20:18
  • @noah I have updated the question with head() output and what my result should look like. Commented Nov 5, 2020 at 20:37
  • @piterbarg Because that doesn't update the existing df1; it creates a new dataframe object, which I am unable to pass on in my class. I make all my changes to a single dataframe using inplace=True. I haven't tried the copy part, let me try that. Commented Nov 5, 2020 at 20:38
  • I see. Here is what seems to be a relevant discussion: stackoverflow.com/questions/39783570/… Commented Nov 5, 2020 at 20:46

1 Answer

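As I understand it, the objective is to assign the contents of df2 to df1 while making sure that id(df1) does not change. The following seems to do it, though it is probably not the most elegant way. The main difference from what you tried is that it drops the index as well as the columns. With the old two-row index still in place, each assignment is aligned on that index, so only rows 0 and 1 of df2 are kept.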

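# Drop every column, then every row; df1 is now empty but is still the same object
df1.drop(df1.columns, axis=1, inplace=True)
df1.drop(df1.index, inplace=True)
# With an empty index, df1 takes on df2's index when the columns are assigned
df1[df2.columns] = df2[df2.columns]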
df1.head()
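
To check that the object identity is preserved, a minimal self-contained sketch along these lines can be run. It rebuilds the sample data from the question; the name original_id is introduced here purely for illustration and is not part of the original code.

```python
import pandas as pd

# Sample data from the question
data = {"ID": [1, 2], "2020-11-01": [10, 15], "2020-11-02": [43, 35]}
df1 = pd.DataFrame(data)
df2 = df1.melt(id_vars=["ID"], var_name="ReportDate", value_name="Units")

# Illustrative name (an assumption): remember which object df1 refers to
original_id = id(df1)

# Empty df1 in place: columns first, then rows
df1.drop(df1.columns, axis=1, inplace=True)
df1.drop(df1.index, inplace=True)

# Refill the same object from df2
df1[df2.columns] = df2[df2.columns]

print(id(df1) == original_id)  # is df1 still the same object?
print(df1.shape)               # rows and columns now held by df1
```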

It may be better design to have a function, process_data, that is used like this:

df1 = process_data(df1)

df1 can then be changed freely inside the function, and whatever the function returns is assigned back to the same variable name.
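
As a rough sketch of that design, assuming the only reshaping step is the melt from the question: process_data is the name suggested above, but its body here is a guess at what such a function might contain, not the asker's actual pipeline.

```python
import pandas as pd


def process_data(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical body: return a long-format version of the wide frame."""
    # melt returns a new DataFrame; the input frame is left untouched
    return df.melt(id_vars=["ID"], var_name="ReportDate", value_name="Units")


# Sample data from the question
data = {"ID": [1, 2], "2020-11-01": [10, 15], "2020-11-02": [43, 35]}
df1 = pd.DataFrame(data)

# Rebind the name df1 to the processed result
df1 = process_data(df1)
print(df1)
```

Note that this rebinds the name df1 to a new object, so any other reference to the original frame still sees the wide data. That is the trade-off against the in-place approach.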

1 Comment

Thanks, I didn't know I could drop the index as well. I will try to incorporate a function into the existing design.
