I am trying to append many data frames to one empty data frame, but it is not working. I am following this tutorial.

I generate each frame inside a loop with this code:

def loop_single_symbol(p1):
    i = 0
    delayedPrice = []
    symbol = [] 
    while i<5 :
        print(p1)
        h = get_symbol_data(p1)
        delayedPrice.append(h['delayedPrice']) 
        symbol.append(h['symbol'])
        i+=1
    df = pd.DataFrame([], columns = []) 
    df["delayedPrice"] = delayedPrice
    df["symbol"] = symbol
    df["time"] = get_nyc_time()
    return df 
    time.sleep(4) 

This code generates a frame like this:

   delayedPrice symbol time
0          30.5    BAC  6:6
1          30.5    BAC  6:6
2          30.5    BAC  6:6
3          30.5    BAC  6:6
4          30.5    BAC  6:6

And I am running a loop like this:

length = len(symbol_list())
data = ["BAC","AAPL"]
df = pd.DataFrame([], columns = []) 
for j in range(length): 
    u = data[j]
    if h:
        df_of_single_symbol = loop_single_symbol(u)
        print(df_of_single_symbol)
        df.append(df_of_single_symbol, ignore_index = True)        
print(df)

I am trying to append two or more data frames to one empty data frame, but with the code above I get:

Empty DataFrame
Columns: []
Index: []

And I want a result like this:

   delayedPrice symbol time
0          30.5    BAC  6:6
1          30.5    BAC  6:6
2          30.5    BAC  6:6
3          30.5    BAC  6:6
4          30.5    BAC  6:6
0        209.15   AAPL  6:6
1        209.15   AAPL  6:6
2        209.15   AAPL  6:6
3        209.15   AAPL  6:6
4        209.15   AAPL  6:6

How can I do this using pandas, and what is the best way to do it?

Note: this line

h = get_symbol_data(p1)

fetches some data from an API.

  • Unlike list.append, pd.DataFrame.append is not an in-place operation. You need to assign the appended dataframe back to df. Commented May 3, 2019 at 10:34
  • Pandas dataframes do not work like lists; they are much more complex data structures, and appending is not really considered the best approach. Why not consider a dictionary, a file, or even better a database to store the API fetches, and then visualise or process the data by converting it into pandas? Commented May 3, 2019 at 10:35
  • One approach is to store the API output in a database, then model/update a column before reporting. Commented May 3, 2019 at 10:37
  • I can do that, but here I want to append to the data frame. How can I append a new frame that I am creating to an empty data frame? Commented May 3, 2019 at 10:39
  • See my answer. In short: join or pd.concat will do. Commented May 3, 2019 at 10:41

2 Answers

As I mentioned in my comment, appending to pandas dataframes is not considered a very good approach. Instead, I suggest that you use something more appropriate to store the data, such as a file or a database if you want scalability.

Then you can use pandas for what it's built for, i.e. data analysis, by just reading the contents of the database or the file into a dataframe.

Now, if you really want to stick with this approach, I suggest either join or concat to grow your dataframe as you get more data.

[EDIT]

Example (from one of my scripts):

results = pd.DataFrame()
for result_file in result_files:
    df = parse_results(result_file)
    results = pd.concat([results, df], axis=0).reset_index(drop=True)

parse_results is a function that takes a filename and returns a dataframe formatted in the right way; it is up to you to adapt it to your needs.
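
Applied to the code in the question, the same idea could look roughly like the sketch below. It collects the per-symbol frames in a list and calls pd.concat once at the end, which avoids re-copying the growing frame on every iteration. Note that get_symbol_data and get_nyc_time here are hypothetical stand-ins returning fixed sample values, since the real API call is not shown in the question, and the samples parameter is an assumption added for illustration.

```python
import pandas as pd


def get_symbol_data(symbol):
    # Hypothetical stand-in for the asker's API call; returns fixed sample values.
    prices = {"BAC": 30.5, "AAPL": 209.15}
    return {"symbol": symbol, "delayedPrice": prices[symbol]}


def get_nyc_time():
    # Hypothetical stand-in for the asker's clock helper.
    return "6:6"


def loop_single_symbol(symbol, samples=5):
    # Fetch the symbol several times and build one frame from the collected rows.
    rows = [get_symbol_data(symbol) for _ in range(samples)]
    df = pd.DataFrame(rows, columns=["delayedPrice", "symbol"])
    df["time"] = get_nyc_time()
    return df


# Build every per-symbol frame first, then concatenate them in a single call.
frames = [loop_single_symbol(s) for s in ["BAC", "AAPL"]]
result = pd.concat(frames)
print(result)
```

Each frame keeps its own 0 to 4 index here, which matches the desired output in the question; passing ignore_index=True to pd.concat would renumber the rows from 0 to 9 instead.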

2 Comments

The answer would look better if you could demonstrate join or concat in my code.
As the comments stated, your original error is that you didn't assign the result of the df.append call to a variable: it returns a new, appended DataFrame and leaves the original unchanged.
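
A small sketch of that behaviour follows. It uses pd.concat rather than df.append, because DataFrame.append was removed in pandas 2.0; the point about assigning the result is the same for both. The two sample frames are made up for illustration.

```python
import pandas as pd

first = pd.DataFrame({"delayedPrice": [30.5], "symbol": ["BAC"]})
second = pd.DataFrame({"delayedPrice": [209.15], "symbol": ["AAPL"]})

# Result discarded: `first` is left unchanged, as with the unassigned df.append call.
pd.concat([first, second], ignore_index=True)
rows_without_assignment = len(first)

# Result assigned: `combined` refers to the new frame holding both rows.
combined = pd.concat([first, second], ignore_index=True)
rows_with_assignment = len(combined)

print(rows_without_assignment, rows_with_assignment)
```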

For anyone else looking to "extend" a DataFrame in place (without an intermediate DB, list or dictionary), here is a hint showing how to do this simply:

Pandas adding rows to df in loop

Basically, start with your empty DataFrame, already set up with the correct columns, then use df.loc[] indexing to assign the new row of data to the end of the dataframe, where len(df) points just past the end of the DataFrame. It looks like this:

   df.loc[len(df)] = ["my", "new", "data", "row"]

More detail in the linked hint.
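
As a minimal sketch of this pattern, using the column names from the question and made-up sample rows:

```python
import pandas as pd

# Start with an empty frame that already has the right columns.
df = pd.DataFrame(columns=["delayedPrice", "symbol", "time"])

# Sample rows for illustration only; in the question these would come from the API.
for price, symbol in [(30.5, "BAC"), (209.15, "AAPL")]:
    # len(df) is the next unused integer label, so this adds a row at the end.
    df.loc[len(df)] = [price, symbol, "6:6"]

print(df)
```

This relies on the index being the default 0, 1, 2, ... sequence. If the frame has a different index, len(df) may match an existing label and overwrite that row instead of adding one.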
