Pandas: Creating empty dataframe in for loop, appending

Question

I would like to create a pandas DataFrame of shape (25520*43, 3) in a for loop.

I created the dataframe like:

lst=['Region', 'GeneID', 'DistanceValue']

df=pd.DataFrame(index=lst).T

Now I want to fill 'Region' 43 times with 25520 values each, and likewise 'GeneID' and 'DistanceValue'.

This is my for loop for that:

for i in range(43):
    df.DistanceValue = np.sort(distance[i,:])
    df.Region = np.ones(25520) * i
    args = np.argsort(distance[i,:])
    df.GeneID = ids[int(args[i])]

But then my df only has shape (25520, 3), so only the last of the 43 iterations is filled in. How can I concatenate all 43 iterations into my df?

ManojK · Accepted Answer · 2020-04-01 07:53:52Z

I can't reproduce your example, but there are a couple of corrections you can make:

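import numpy as np
import pandas as pd

lst = ['Region', 'GeneID', 'DistanceValue']
df = pd.DataFrame(columns=lst)  # empty frame with the three columns

region = []  # collect one array per iteration
for i in range(43):
    region.append(np.ones(25520) * i)  # region label i, repeated 25520 times

# Flatten the 43 arrays into a single list of 43 * 25520 values
flat_list = [item for sublist in region for item in sublist]
df['Region'] = flat_list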
First create a new list outside the loop, then append values to it inside the loop. flat_list consolidates all 43 arrays into one list, which you can then assign to the DataFrame column. It is usually easier to fill DataFrame values outside the loop.
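The flattening step can also be left to NumPy. As a small sketch, with reduced illustrative sizes standing in for 43 and 25520, `np.concatenate` joins the per-iteration arrays into one 1-D array and avoids the nested Python-level comprehension:

```python
import numpy as np

n_regions, n_genes = 3, 4  # illustrative sizes; the question uses 43 and 25520

region = []
for i in range(n_regions):
    region.append(np.ones(n_genes) * i)  # one array of repeated labels per iteration

# Join all per-iteration arrays into a single 1-D array
flat = np.concatenate(region)
print(flat.shape)
```

The resulting array can be assigned to a DataFrame column in the same way as the flattened list.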

Similarly you can update all 3 columns.
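One way to fill all three columns at once is to build a small DataFrame per iteration, collect them in a list, and concatenate once after the loop. The sketch below is a guess at the intent rather than a reproduction of the original data: it assumes `distance` is a (43, 25520) NumPy array and `ids` is a NumPy array of 25520 gene identifiers, and it uses `ids[order]` in place of the question's `ids[int(args[i])]` so that each sorted distance stays paired with its own gene ID. Small random stand-in data is used so the block runs on its own.

```python
import numpy as np
import pandas as pd

# Stand-in data (assumption): the real arrays would be (43, 25520) and (25520,).
n_regions, n_genes = 4, 6
rng = np.random.default_rng(0)
distance = rng.random((n_regions, n_genes))
ids = np.array([f"gene{j}" for j in range(n_genes)])

frames = []
for i in range(n_regions):
    order = np.argsort(distance[i, :])  # indices that sort row i
    frames.append(pd.DataFrame({
        "Region": np.full(n_genes, i),        # region label for every row
        "GeneID": ids[order],                 # gene IDs in sorted-distance order
        "DistanceValue": distance[i, order],  # same values as np.sort(distance[i, :])
    }))

# Concatenate once, after the loop, and renumber the rows from 0.
df = pd.concat(frames, ignore_index=True)
print(df.shape)
```

With the sizes from the question, the result would have 43 * 25520 = 1097360 rows and 3 columns. Concatenating once at the end also avoids repeatedly copying a growing DataFrame inside the loop.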
