Sorting value by two columns in Pandas Python

Question

The idea is to sort values by two columns. Given two columns, I am expecting output like this:

Expected output

      x     y
0   2.0   NaN
1   3.0   NaN
2   4.0   4.1
3   NaN   5.0
4  10.0   NaN
5  24.0  24.7
6  31.0  31.4

However, using the code below

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'x': [2, 3, 4, 24, 31, '', 10],
                    'y': ['', '', 4.1, 24.7, 31.4, 5, '']})
df1.replace(r'^\s*$', np.nan, regex=True, inplace=True)
rslt_df = df1.sort_values(by=['x', 'y'], ascending=(True, True))

print(rslt_df)

produces the following:

      x     y
0   2.0   NaN
1   3.0   NaN
2   4.0   4.1
6  10.0   NaN
3  24.0  24.7
4  31.0  31.4
5   NaN   5.0

Notice that the last row, with 5.0 in column y, is placed at the bottom.

May I know what modification to the code is needed to obtain the intended output?

It's sorting by x first (NaN goes to the bottom), then by y.
– ifly6, Commented Jun 17, 2021 at 16:43
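
The behaviour described in this comment can be checked directly. The following is a minimal sketch, assuming the blanks have already been converted to NaN (the frame is built with np.nan here rather than with the regex replace from the question). It only compares the resulting row order for the two na_position settings of sort_values:

```python
import numpy as np
import pandas as pd

# Same data as in the question, with blanks written as NaN (an assumption
# made here to keep both columns numeric).
df1 = pd.DataFrame({
    'x': [2, 3, 4, 24, 31, np.nan, 10],
    'y': [np.nan, np.nan, 4.1, 24.7, 31.4, 5, np.nan],
})

# sort_values places NaN keys last by default (na_position='last'),
# so the row whose x is NaN ends up at the bottom regardless of its y.
default_order = df1.sort_values(by=['x', 'y']).index.tolist()

# na_position='first' only moves that row to the top; it is still not
# placed according to its y value.
nan_first_order = df1.sort_values(
    by=['x', 'y'], na_position='first'
).index.tolist()

print(default_order)
print(nan_first_order)
```

Neither setting interleaves the NaN row by its y value, which is why a combined sort key is needed.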

halfer · Accepted Answer · 2021-12-14 21:18:01Z

3

Try sorting by x with its missing values filled from y (fillna), then reindex the DataFrame using the index of those sorted values:

df1.reindex(df1['x'].fillna(df1['y']).sort_values().index).reset_index(drop=True)

To update the df1 variable:

df1 = (
    df1.reindex(df1['x'].fillna(df1['y']).sort_values().index)
        .reset_index(drop=True)
)

df1:

      x     y
0   2.0   NaN
1   3.0   NaN
2   4.0   4.1
3   NaN   5.0
4  10.0   NaN
5  24.0  24.7
6  31.0  31.4
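
For readers who want to run this end to end, here is the same idea spelled out step by step. It is a sketch under the assumption that the blanks are already NaN (the frame is built with np.nan instead of the regex replace); the names key and result are only illustrative:

```python
import numpy as np
import pandas as pd

# Same data as in the question, with blanks written as NaN (assumption).
df1 = pd.DataFrame({
    'x': [2, 3, 4, 24, 31, np.nan, 10],
    'y': [np.nan, np.nan, 4.1, 24.7, 31.4, 5, np.nan],
})

# Build a single sort key: x where it is available, otherwise y.
key = df1['x'].fillna(df1['y'])

# Sort the key, reorder the rows of df1 to follow the sorted key's index,
# and renumber the rows from 0.
result = df1.loc[key.sort_values().index].reset_index(drop=True)

print(result)
```

Note that the key prefers x whenever x is present, so y only decides the position of rows where x is missing.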

edited Dec 14, 2021 at 21:18

halfer

answered Jun 17, 2021 at 16:43

Henry Ecker♦

anky · Accepted Answer · 2021-06-17 16:55:15Z

2

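With np.sort and argsort: sort each row so that its smallest non-NaN value comes first (np.sort places NaN last), then order the rows by that value: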

df1.iloc[np.sort(df1[['x','y']],axis=1)[:,0].argsort()]

      x     y
0   2.0   NaN
1   3.0   NaN
2   4.0   4.1
5   NaN   5.0
6  10.0   NaN
3  24.0  24.7
4  31.0  31.4
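
A pandas-only variant of the same row-wise idea, offered as a sketch rather than as this answer's method: DataFrame.min(axis=1) skips NaN by default, so it yields the same per-row key as the first column of the row-sorted array. The frame is built with np.nan directly (an assumption, to keep both columns numeric), and row_min and result are illustrative names:

```python
import numpy as np
import pandas as pd

# Same data as in the question, with blanks written as NaN (assumption).
df1 = pd.DataFrame({
    'x': [2, 3, 4, 24, 31, np.nan, 10],
    'y': [np.nan, np.nan, 4.1, 24.7, 31.4, 5, np.nan],
})

# Row-wise minimum of x and y; NaN is skipped (skipna=True is the default).
row_min = df1[['x', 'y']].min(axis=1)

# Reorder the rows by that key, keeping the original index labels.
result = df1.loc[row_min.sort_values().index]

print(result)
```

Unlike the fillna approach, this key uses the smaller of x and y, so the two methods can order rows differently when y is less than x. For the data in the question they agree.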

answered Jun 17, 2021 at 16:55

anky

2 Comments

rpb Over a year ago

This does exactly what the OP intends, with the advantage of being more compact.

rpb Over a year ago

@HenryEcker, I think you should keep your post. I learned something from it.
