
I'm running a loop and trying to append its results to a pandas DataFrame I created. The goal is for the DataFrame to contain all the results from the loop.

I am having trouble getting DataFrame.append to work. Right now it does not seem to append; instead it overwrites the existing row, and I'm left with only the last row from the loop. I know the data is correct, since I can print it inside the loop and see the correct values. Hopefully I'm missing something simple.

for year in dfClose['year'].unique():
    tempYearDF = dfClose[dfClose['year'] == year]
    for i in dfClose['month'].unique():
        tempOpenDF = tempYearDF.loc[tempYearDF["month"] == i, "open"]
        tempCloseDF = tempYearDF.loc[tempYearDF["month"] == i, "close"]
        # The if statement below skips months that haven't happened yet in the latest year.
        if len(tempOpenDF) > 0:
            othernumpyopen = tempOpenDF.to_numpy()
            othernumpyclose = tempCloseDF.to_numpy()
            aroundOpen = np.around(othernumpyopen[0], 3)
            aroundClose = np.around(othernumpyclose[-1], 3)
            month_pd = pd.DataFrame(columns=["YEAR", "MONTH", "MONTH OPEN", "MONTH CLOSE"])
            month_pd = month_pd.append({'YEAR': year, 'MONTH': i, 'MONTH OPEN': aroundOpen, 'MONTH CLOSE': aroundClose}, ignore_index=True)

This is what I'm left with after executing. I am trying to get all the rows into the df.

     YEAR  MONTH  MONTH OPEN  MONTH CLOSE
0  2020.0    4.0       246.5       286.69

Example output from when I add the print to the loop.

     YEAR  MONTH  MONTH OPEN  MONTH CLOSE
0  2020.0    1.0      296.24       309.51
     YEAR  MONTH  MONTH OPEN  MONTH CLOSE
0  2020.0    2.0       304.3       273.36
     YEAR  MONTH  MONTH OPEN  MONTH CLOSE
0  2020.0    3.0      282.28       254.29
     YEAR  MONTH  MONTH OPEN  MONTH CLOSE
0  2020.0    4.0       246.5       286.69

Example of dfClose if you need it

        open  year  month  day        date
0  30.490000  2010      1    4  2010-01-04
1  30.657143  2010      1    5  2010-01-05
2  30.625713  2010      1    6  2010-01-06
3  30.250000  2010      1    7  2010-01-07
4  30.042856  2010      1    8  2010-01-08


open     float64
close    float64
year       int64
month      int64
day        int64
date      object
dtype: object

1 Answer

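You are re-creating month_pd as an empty DataFrame on every iteration of the loop, so each pass overwrites the previous result and only the last row survives. Instead, collect the one-row DataFrames in a list and concatenate them once at the end.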

import numpy as np
import pandas as pd

dfs = []
for year in dfClose['year'].unique():
    tempYearDF = dfClose[dfClose['year'] == year]
    for i in dfClose['month'].unique():
        tempOpenDF = tempYearDF.loc[tempYearDF["month"] == i, "open"]
        tempCloseDF = tempYearDF.loc[tempYearDF["month"] == i, "close"]
        # Skip months that haven't happened yet in the latest year.
        if len(tempOpenDF) > 0:
            othernumpyopen = tempOpenDF.to_numpy()
            othernumpyclose = tempCloseDF.to_numpy()
            aroundOpen = np.around(othernumpyopen[0], 3)
            aroundClose = np.around(othernumpyclose[-1], 3)
            # All values are scalars, so pandas needs an explicit index to build a one-row frame.
            dfs.append(pd.DataFrame({'YEAR': year, 'MONTH': i, 'MONTH OPEN': aroundOpen, 'MONTH CLOSE': aroundClose}, index=[0]))

# Concatenate once, after the loop; ignore_index renumbers the rows from 0.
month_pd = pd.concat(dfs, ignore_index=True)
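
As a side note, DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on current versions some form of concat is needed anyway. If the loop itself is not required, the same first-open / last-close table per month could probably be built with groupby. The following is only a sketch and not part of the original answer: the sample dfClose values are made up for illustration, and it assumes the rows are already sorted by date within each month.

```python
import pandas as pd

# Hypothetical sample data standing in for dfClose (values are made up).
dfClose = pd.DataFrame({
    "open":  [10.0, 11.0, 12.0, 13.0, 14.0],
    "close": [10.5, 11.5, 12.5, 13.5, 14.5],
    "year":  [2020, 2020, 2020, 2020, 2021],
    "month": [1, 1, 2, 2, 1],
})

# First open and last close per (year, month). Only groups that actually
# exist appear in the result, so no length check is needed for months
# that have not happened yet.
month_pd = (
    dfClose.groupby(["year", "month"], as_index=False)
    .agg(**{"MONTH OPEN": ("open", "first"), "MONTH CLOSE": ("close", "last")})
    .round(3)
    .rename(columns={"year": "YEAR", "month": "MONTH"})
)
print(month_pd)
```

This yields one row per year/month pair that is present in the data, with the same four columns as the loop version.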

1 Comment

Awesome, yes, this worked. I just had to add an index to the end of your append, since it was throwing an "If using all scalar values, you must pass an index" error.
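
For anyone hitting the same error, here is a small illustration of what the comment describes. It is a sketch using the last row from the question as sample values; the variable names are only illustrative. A dict of bare scalars gives pandas no way to infer the number of rows, so either an explicit index or a list containing one record is needed.

```python
import pandas as pd

row = {"YEAR": 2020, "MONTH": 4, "MONTH OPEN": 246.5, "MONTH CLOSE": 286.69}

# A dict of bare scalars has no row count to infer, so this raises ValueError.
try:
    pd.DataFrame(row)
    raised = False
except ValueError:
    raised = True

# Two ways to build a one-row frame from the same dict.
with_index = pd.DataFrame(row, index=[0])   # pass an explicit index
from_records = pd.DataFrame([row])          # wrap the dict in a list of records
```

Either form can be appended to the list inside the loop before the final concat.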
