How do I separate rows and form a new DataFrame from the Series?

Suppose I have a DataFrame df. I am iterating over it with the code below and trying to append each row to one of two empty DataFrames:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)),
                  columns=['a', 'b', 'c', 'd', 'e'])

df1 = pd.DataFrame()
df2 = pd.DataFrame()

for index,row in df.iterrows():
    if (few conditions goes here):
        df1.append(row)
    else:
        df2.append(row)

The type of each row during iteration is a Series, but if I append it to an empty DataFrame, the rows end up as columns and the columns as rows. Is there a fix for this?

2 Answers

2

I think the best approach is to avoid iterating and use boolean indexing instead, with conditions chained by & for AND, | for OR, ~ for NOT and ^ for XOR:

# define all conditions
mask = (df['a'] > 2) & (df['b'] > 3)
# filter
df1 = df[mask]
# invert the condition with ~
df2 = df[~mask]

Sample:

np.random.seed(125)
df = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)),
                  columns=['a', 'b', 'c', 'd', 'e'])
print (df)
   a  b  c  d  e
0  2  7  3  6  0
1  5  6  2  5  0
2  4  2  9  0  7
3  2  7  9  5  3
4  5  7  9  9  1

mask = (df['a'] > 2) & (df['b'] > 3)
print (mask)
0    False
1     True
2    False
3    False
4     True


df1 = df[mask]
print (df1)
   a  b  c  d  e
1  5  6  2  5  0
4  5  7  9  9  1

df2 = df[~mask]
print (df2)
   a  b  c  d  e
0  2  7  3  6  0
2  4  2  9  0  7
3  2  7  9  5  3
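
When there are many conditions, it may help to name each one and combine them afterwards. The following is only a sketch: the names cond_a and cond_b are made up for illustration, and the frame is typed in by hand from the sample output above instead of relying on the random seed.

```python
import pandas as pd

# Sample data copied from the printed frame above.
df = pd.DataFrame([[2, 7, 3, 6, 0],
                   [5, 6, 2, 5, 0],
                   [4, 2, 9, 0, 7],
                   [2, 7, 9, 5, 3],
                   [5, 7, 9, 9, 1]],
                  columns=['a', 'b', 'c', 'd', 'e'])

# Hypothetical names for the individual conditions.
cond_a = df['a'] > 2
cond_b = df['b'] > 3

# AND of both conditions, and its inverse.
mask = cond_a & cond_b
df1 = df[mask]
df2 = df[~mask]

# OR keeps rows where at least one condition holds.
either = df[cond_a | cond_b]
# XOR keeps rows where exactly one condition holds.
exactly_one = df[cond_a ^ cond_b]
```

Each comparison needs its own parentheses when it is written inline, because & and | bind more tightly than > and ==. Naming the conditions first avoids that problem.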

EDIT:

Loop version; avoid it if possible, because it is slow:

df1 = pd.DataFrame(columns=df.columns)
df2 = pd.DataFrame(columns=df.columns)

for index, row in df.iterrows():
    if (row['a'] > 2) and (row['b'] > 3):
        df1.loc[index] = row
    else:
        df2.loc[index] = row


print (df1)
   a  b  c  d  e
1  5  6  2  5  0
4  5  7  9  9  1

print (df2)
   a  b  c  d  e
0  2  7  3  6  0
2  4  2  9  0  7
3  2  7  9  5  3
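
If the loop really cannot be avoided, one possible variant is to collect the rows in plain Python lists and build each DataFrame once at the end, instead of growing a DataFrame row by row. This is a sketch under that assumption, not a benchmarked recommendation; rows1 and rows2 are names chosen for illustration, and the frame is typed in from the sample above.

```python
import pandas as pd

# Sample data copied from the printed frame above.
df = pd.DataFrame([[2, 7, 3, 6, 0],
                   [5, 6, 2, 5, 0],
                   [4, 2, 9, 0, 7],
                   [2, 7, 9, 5, 3],
                   [5, 7, 9, 9, 1]],
                  columns=['a', 'b', 'c', 'd', 'e'])

# Hypothetical helper lists that hold the matching Series.
rows1, rows2 = [], []

for index, row in df.iterrows():
    if (row['a'] > 2) and (row['b'] > 3):
        rows1.append(row)
    else:
        rows2.append(row)

# A list of Series becomes one row per Series, so nothing is transposed.
# Passing columns keeps the layout even when a list is empty.
df1 = pd.DataFrame(rows1, columns=df.columns)
df2 = pd.DataFrame(rows2, columns=df.columns)
```

Each Series keeps its original row label as its name, so df1 and df2 end up with the same index values as the corresponding rows of df.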

2 Comments

Yeah, but the problem is that I have many complex conditions on different columns, and I am actually iterating over 2 DataFrames. After all the filters, I get these rows in the else block. So, is there any other way?
I added a loop solution as well.
1

Try the query method

df2 = df.query('conditions go here')
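
As an illustration only, here is what that could look like with the a > 2 and b > 3 condition from the other answer; the frame is the sample data from that answer, and the condition string is just an example, not the asker's real filter.

```python
import pandas as pd

# Sample data copied from the other answer's printed frame.
df = pd.DataFrame([[2, 7, 3, 6, 0],
                   [5, 6, 2, 5, 0],
                   [4, 2, 9, 0, 7],
                   [2, 7, 9, 5, 3],
                   [5, 7, 9, 9, 1]],
                  columns=['a', 'b', 'c', 'd', 'e'])

# Rows that satisfy the condition.
df1 = df.query('a > 2 and b > 3')
# Rows that do not satisfy it.
df2 = df.query('not (a > 2 and b > 3)')
```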
