0

I have a pandas dataframe with data like:

+-----------+-----------------+---------+
| JOB-NAME  |  Status         | SLA     |
+-----------+-----------------+---------+
| job_1     | YET_TO_START    | ---     |
| job_3     | COMPLETED       | MET     |
| job_4     | RUNNING         | MET     |
| job_2     | YET_TO_START    | LATE    |
| job_6     | RUNNING         | LATE    |
| job_5     | FAILED          | LATE    |
| job_7     | YET_TO_START    | ---     |
| job_8     | COMPLETED       | NOT_MET |
+-----------+-----------------+---------+

I need to sort this table based on the Status and SLA states, like for Status: FAILED will be top on the table, then YET_TO_START, then RUNNING, and finally COMPLETED. Similarly for SLA the order will be LATE, ---, NOT_MET, and MET. Like this:

+-----------+-----------------+---------+
| JOB-NAME  |  Status         | SLA     |
+-----------+-----------------+---------+
| job_5     | FAILED          | LATE    |
| job_2     | YET_TO_START    | LATE    |
| job_1     | YET_TO_START    | ---     |
| job_7     | YET_TO_START    | ---     |
| job_6     | RUNNING         | LATE    |
| job_4     | RUNNING         | MET     |
| job_8     | COMPLETED       | NOT_MET |
| job_3     | COMPLETED       | MET     |
+-----------+-----------------+---------+

I am able to do this custom sorting priority-based only on single column Status, but unable to do for multiple columns.

sort_order_dict = {"FAILED":0, "YET_TO_START":1, "RUNNING":2, "COMPLETED":3}
joined_df = joined_df.sort_values(by=['status'], key=lambda x: x.map(sort_order_dict))

A solution is given here, but its for single column, not multiple column.

0

2 Answers 2

5

You can extend dictionary by values from another columns, only necessary different keys in both columns for correct working like mentioned mozway in comments:

sort_order_dict = {"FAILED":0, "YET_TO_START":1, "RUNNING":2, "COMPLETED":3,
                   "LATE":4, "---":5, "NOT_MET":6, "MET":7}
df = df.sort_values(by=['Status','SLA'], key=lambda x: x.map(sort_order_dict))
print (df)
  JOB-NAME        Status      SLA
5    job_5        FAILED     LATE
3    job_2  YET_TO_START     LATE
0    job_1  YET_TO_START      ---
6    job_7  YET_TO_START      ---
4    job_6       RUNNING     LATE
2    job_4       RUNNING      MET
7    job_8     COMPLETED  NOT_MET
1    job_3     COMPLETED      MET

Or use ordered Categorical:

df['Status'] = pd.Categorical(df['Status'], ordered=True, 
                              categories=['FAILED', 'YET_TO_START', 'RUNNING', 'COMPLETED'])
                                          
df['SLA'] = pd.Categorical(df['SLA'], ordered=True, 
                              categories= ['LATE', '---', 'NOT_MET', 'MET'])
df = df.sort_values(by=['Status','SLA'])
print (df)
  JOB-NAME        Status      SLA
5    job_5        FAILED     LATE
3    job_2  YET_TO_START     LATE
0    job_1  YET_TO_START      ---
6    job_7  YET_TO_START      ---
4    job_6       RUNNING     LATE
2    job_4       RUNNING      MET
7    job_8     COMPLETED  NOT_MET
1    job_3     COMPLETED      MET
Sign up to request clarification or add additional context in comments.

3 Comments

I like the idea of using a single dictionary, but the caveat is that if you have common items in both columns and want a different order, you won't be able to have duplicate keys.
@mozway - Then is possible use another solution. Also obiously different values.
as I said, I like it, but the caveat needs to be known IMO.
3

Use numpy.lexsort, you can use any number of parameters easily:

import numpy as np

sort_order_dict = {"FAILED":0, "YET_TO_START":1, "RUNNING":2, "COMPLETED":3}
sort_order_dict2 = {'LATE': 0, '---': 1, 'NOT_MET': 2, 'MET': 3}

order = np.lexsort([df['SLA'].map(sort_order_dict2),
                    df['Status'].map(sort_order_dict),
                    ])

out = df.iloc[order]

Output:

  JOB-NAME        Status      SLA
5    job_5        FAILED     LATE
3    job_2  YET_TO_START     LATE
0    job_1  YET_TO_START      ---
6    job_7  YET_TO_START      ---
4    job_6       RUNNING     LATE
2    job_4       RUNNING      MET
7    job_8     COMPLETED  NOT_MET
1    job_3     COMPLETED      MET

2 Comments

Great solution! A small tip is that the primary key that needs to be sorted should be placed at the end.
@Jeremy yes, lexsort reverses the order of the key. In fact it's logical in terms of the sorting steps assuming successive sorts, the last sorting step is the significant one.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.