Python pandas: append DataFrames in a loop

Question

I am trying to append many data frames to one empty data frame, but it is not working. I am following this tutorial.

I generate a frame inside a loop with this code:

def loop_single_symbol(p1):
    i = 0
    delayedPrice = []
    symbol = [] 
    while i<5 :
        print(p1)
        h = get_symbol_data(p1)
        delayedPrice.append(h['delayedPrice']) 
        symbol.append(h['symbol'])
        i+=1
    df = pd.DataFrame([], columns = []) 
    df["delayedPrice"] = delayedPrice
    df["symbol"] = symbol
    df["time"] = get_nyc_time()
    return df 
    time.sleep(4)

This code is generating a frame like this:

   delayedPrice symbol time
0          30.5    BAC  6:6
1          30.5    BAC  6:6
2          30.5    BAC  6:6
3          30.5    BAC  6:6
4          30.5    BAC  6:6

And I am running a loop like this:

length = len(symbol_list())
data = ["BAC","AAPL"]
df = pd.DataFrame([], columns = []) 
for j in range(length): 
    u = data[j]
    if h:
        df_of_single_symbol = loop_single_symbol(u)
        print(df_of_single_symbol)
        df.append(df_of_single_symbol, ignore_index = True)        
print(df)

I am trying to append two or more data frames to one empty data frame, but with the code above I get:

Empty DataFrame
Columns: []
Index: []

And I want a result like this:

   delayedPrice symbol time
0          30.5    BAC  6:6
1          30.5    BAC  6:6
2          30.5    BAC  6:6
3          30.5    BAC  6:6
4          30.5    BAC  6:6
0        209.15   AAPL  6:6
1        209.15   AAPL  6:6
2        209.15   AAPL  6:6
3        209.15   AAPL  6:6
4        209.15   AAPL  6:6

How can I do this with pandas, and what is the best way to do it?

Note: this line

h = get_symbol_data(p1)

fetches some data from an API.

Unlike list.append, pd.DataFrame.append is not an in-place operation. You need to assign the appended dataframe back to df.
– Chris, Commented May 3, 2019 at 10:34
Pandas dataframes do not work like lists; they are much more complex data structures, and appending is not really considered the best approach. Why not consider a dictionary, a file, or even better a database to store the API fetches, then visualise or process the data by loading it into pandas?
– qmeeus, Commented May 3, 2019 at 10:35
One approach is to store the API output in a database, then model/update a column before reporting.
– MEdwin, Commented May 3, 2019 at 10:37
I can do that, but here I want to append to the data frame. How can I append a new frame that I am creating to an empty data frame?
– Nilay Singh, Commented May 3, 2019 at 10:39

qmeeus · Accepted Answer · 2019-05-03 10:54:42Z

As I mentioned in my comment, appending to pandas dataframes is not considered a good approach, because every append copies the existing data into a new frame. Instead, I suggest that you use something more appropriate to store the data, such as a file, or a database if you want scalability.

Then you can use pandas for what it's built for, i.e. data analysis, by simply reading the contents of the database or file into a dataframe.
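A minimal sketch of that idea, assuming a CSV file as the store. The fetch_quote function is a hypothetical stand-in for the question's get_symbol_data, and the file name and fixed prices are made up for illustration:

```python
import csv
import os
import tempfile

import pandas as pd


def fetch_quote(symbol):
    """Hypothetical stand-in for the question's get_symbol_data API call."""
    prices = {"BAC": 30.5, "AAPL": 209.15}
    return {"symbol": symbol, "delayedPrice": prices[symbol]}


# Illustrative location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "quotes.csv")

# Write one row per fetch to the file; no DataFrame is grown in the loop.
with open(path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["delayedPrice", "symbol"])
    writer.writeheader()
    for symbol in ["BAC", "AAPL"]:
        for _ in range(5):
            writer.writerow(fetch_quote(symbol))

# Load everything in one go when it is time to analyse.
quotes = pd.read_csv(path)
```

The same shape would apply with a database instead of a file: write rows as they arrive, then read them into a dataframe once.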

Now, if you really want to stick with this approach, I suggest either join or concat to grow your dataframe as you get more data.

[EDIT]

Example (from one of my scripts):

results = pd.DataFrame()
for result_file in result_files:
    df = parse_results(result_file)
    results = pd.concat([results, df], axis=0).reset_index(drop=True)

parse_results is a function that takes a filename and returns a correctly formatted dataframe; adapt it to fit your needs.
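Applied to the code in the question, the same idea might look like the sketch below. It is a variation on the loop above, not the answer's own script: the per-symbol frames are collected in a list and concatenated once at the end. get_symbol_data is stubbed with fixed prices here, and the "6:6" time value is a placeholder for the question's get_nyc_time(); both are assumptions for illustration.

```python
import pandas as pd


def get_symbol_data(symbol):
    """Hypothetical stub for the question's API call; returns fixed prices."""
    prices = {"BAC": 30.5, "AAPL": 209.15}
    return {"symbol": symbol, "delayedPrice": prices[symbol]}


def loop_single_symbol(symbol, n=5):
    """Build a small frame of n samples for one symbol."""
    rows = [get_symbol_data(symbol) for _ in range(n)]
    df = pd.DataFrame(rows, columns=["delayedPrice", "symbol"])
    df["time"] = "6:6"  # placeholder for the question's get_nyc_time()
    return df


# Collect the per-symbol frames in a plain list.
frames = [loop_single_symbol(s) for s in ["BAC", "AAPL"]]

# Concatenate once at the end. Each frame keeps its own 0..4 index,
# which matches the desired output shown in the question.
combined = pd.concat(frames)
```

Passing ignore_index=True to pd.concat would renumber the rows 0..9 instead of repeating 0..4.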

The answer would look better if you could demonstrate join or concat in my code.

Demis · Accepted Answer · 2021-04-08 20:31:25Z

As the comments stated, your original error is that you didn't assign the result of the df.append call to a variable; it returns the appended (new) DataFrame.
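On the pandas version used in the question, the fix would be df = df.append(df_of_single_symbol, ignore_index=True). DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so the sketch below uses pd.concat to show the same point: the call returns a new frame and leaves its inputs untouched. The sample values are taken from the question.

```python
import pandas as pd

df = pd.DataFrame({"delayedPrice": [30.5], "symbol": ["BAC"]})
new_rows = pd.DataFrame({"delayedPrice": [209.15], "symbol": ["AAPL"]})

# The return value is discarded here, so df itself is unchanged.
pd.concat([df, new_rows], ignore_index=True)
rows_without_assignment = len(df)

# Rebinding the name keeps the combined frame.
df = pd.concat([df, new_rows], ignore_index=True)
rows_with_assignment = len(df)
```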

For anyone else looking to "extend" their DataFrame in place (without an intermediate DB, list, or dictionary), here is a hint showing how to do this simply:

Pandas adding rows to df in loop

Basically, start with an empty DataFrame that is already set up with the correct columns, then use df.loc[] indexing to assign the new row of data to the end of the dataframe, where len(df) points just past the end of the DataFrame. It looks like this:

df.loc[len(df)] = ["my", "new", "data", "row"]
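A small runnable sketch of this pattern, using the column names from the question and made-up sample rows:

```python
import pandas as pd

# Start from an empty frame that already has the right columns.
df = pd.DataFrame(columns=["delayedPrice", "symbol", "time"])

# Sample values only; in the question they would come from the API.
for price, symbol in [(30.5, "BAC"), (209.15, "AAPL")]:
    # len(df) is the next unused label while the index is the default 0..n-1.
    df.loc[len(df)] = [price, symbol, "6:6"]
```

This relies on the index being the default 0, 1, 2, ... range. If the frame has some other index, len(df) may match an existing label, and the assignment would overwrite that row instead of adding a new one.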
