Suppose I have a 750x750 matrix placed in a DataFrame, say df.
df=
c1 c2 c3 ... c750
c1 5 2 5 ... 3
c2 3 1 5 ... 80
c3 4 2 7 ... 10
. . . . ... .
. . . . ... .
. . . . ... .
c750 8 3 5 ... 1
I want to find the columns holding the 4 highest values in each row. Getting the values themselves is easy:
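import numpy as np
a = np.sort(df.values, axis=1)   # sort a copy of each row so df itself is not modified
sorted_table = a[:, -4:]         # the four largest values per row, in ascending order
b = sorted_table[:, ::-1]        # the same four values, largest first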
However, the result is just a plain array, without the index or the column names.
[[ 98. 29. 15. 10.]
[ 93. 91. 75. 60.]
[ 48. 21. 17. 10.]
.
.
.
...]
How can I find out which column name each of the sorted values refers to?
I would like to display:
df=
c1 c512 c20 c57 c310
c2 c317 c133 c584 c80
c3 c499 c289 c703 c100
. . . . ... .
. . . . ... .
. . . . ... .
c750 c89 c31 c546 c107
where
c512 is referring to 98
c20 is referring to 29
c57 is referring to 15
and so on.
Use df.apply(myfunc, axis=1) instead of df.sort. With axis=1, myfunc receives each row as a Series whose index holds the column names, so you can manipulate the column names together with their values.
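A minimal sketch of what such a function could look like, assuming the values are numeric. The name top_columns, the parameter n and the small 6x6 demo frame are made up for illustration (the real df would be 750x750), and Series.nlargest is one possible way to pick the top entries, not necessarily what the answer had in mind:

```python
import numpy as np
import pandas as pd

def top_columns(row, n=4):
    # row is one row of df as a Series; its index holds the column names.
    # nlargest keeps the n biggest values, largest first, so its index
    # gives the matching column names in the same order.
    return pd.Series(list(row.nlargest(n).index))

# Hypothetical small stand-in for the 750x750 frame (distinct values, so no ties).
rng = np.random.default_rng(0)
labels = [f"c{i}" for i in range(1, 7)]
df = pd.DataFrame(rng.permutation(36).reshape(6, 6), index=labels, columns=labels)

# One row per original row, four columns holding column *names*.
top_names = df.apply(top_columns, axis=1)

# The matching values, if they are needed alongside the names.
top_values = df.apply(lambda row: pd.Series(list(row.nlargest(4))), axis=1)
```

Here top_names keeps the original row index (c1, c2, ...) and its columns are simply numbered 0 to 3 by rank. Row-wise apply is convenient but not especially fast; for a 750x750 frame that should rarely matter.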