I would like to build a pandas DataFrame of shape (25520*43, 3) in a for loop.

I created the DataFrame like this:

lst=['Region', 'GeneID', 'DistanceValue']

df=pd.DataFrame(index=lst).T

Now I want to fill 'Region' 43 times with 25520 values each, and likewise 'GeneID' and 'DistanceValue'.

This is my for loop for that:

for i in range(43):
    df.DistanceValue = np.sort(distance[i,:])
    df.Region = np.ones(25520) * i
    args = np.argsort(distance[i,:])
    df.GeneID = ids[int(args[i])]

But then my df only has shape (25520, 3), so it holds just the last of the 43 iterations. How can I concatenate all 43 iterations into my df?

1 Answer

I can't reproduce your example, but there are a couple of corrections you can make:

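import numpy as np
import pandas as pd

lst = ['Region', 'GeneID', 'DistanceValue']
df = pd.DataFrame(index=lst).T

# Collect one array per iteration in a list created outside the loop
region = []
for i in range(43):
    region.append(np.ones(25520) * i)  # region label i, as in the question

# Flatten the 43 arrays into a single list of 43 * 25520 values
flat_list = [item for sublist in region for item in sublist]
df.Region = flat_list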

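First create a new list outside the loop, then append each iteration's values to it inside the loop. flat_list flattens the 43 sublists into a single list, which you can then assign to the DataFrame column. Assigning inside the loop overwrites the column on every pass, which is why only the last iteration survives; it is generally easier to fill DataFrame values once, outside the loop.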
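As a smaller-scale sketch of the same pattern, the per-iteration arrays can also be joined with `np.concatenate` instead of a nested list comprehension. The sizes `n_regions = 3` and `n_genes = 4` are stand-ins chosen for illustration, not the 43 and 25520 from the question.

```python
import numpy as np
import pandas as pd

n_regions, n_genes = 3, 4  # illustrative sizes, not the question's 43 and 25520

# Collect one array per iteration; nothing is written to the DataFrame yet
region = []
for i in range(n_regions):
    region.append(np.ones(n_genes) * i)

# Join all arrays end to end, then build the column in one step
flat = np.concatenate(region)
df = pd.DataFrame({'Region': flat})
```

Here `df` ends up with `n_regions * n_genes` rows, one block of identical labels per region.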

You can fill all three columns the same way.
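For all three columns at once, one option is to build a small DataFrame per iteration and join them with `pd.concat` after the loop. The sketch below assumes `distance` is a 2-D array with one row per region and `ids` holds one identifier per column of `distance`; both are filled with made-up data here. It also assumes the intent was to reorder `ids` by the sorted distances (`ids[args]`), which differs from the `ids[int(args[i])]` in the question.

```python
import numpy as np
import pandas as pd

# Made-up stand-ins for the question's data (43 x 25520 there)
rng = np.random.default_rng(0)
n_regions, n_genes = 3, 5
distance = rng.random((n_regions, n_genes))
ids = np.array([f"gene{j}" for j in range(n_genes)])

frames = []
for i in range(n_regions):
    # Sort order of this region's distances, smallest first
    args = np.argsort(distance[i, :])
    frames.append(pd.DataFrame({
        'Region': i,                       # scalar is broadcast to every row
        'GeneID': ids[args],               # ids reordered to match the sort
        'DistanceValue': distance[i, args],
    }))

# Stack the per-region frames into one long DataFrame
df = pd.concat(frames, ignore_index=True)
```

With the question's sizes, the same loop should give a DataFrame of 43 * 25520 rows and 3 columns.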
