26

Is there a built-in function or a very simple way of finding the index of n largest elements in a list or a numpy array?

K = [1,2,2,4,5,5,6,10]

Find the index of the largest 5 elements?

I count the duplicates more than once, and the output should be a list of the indices of those largest numbers

4
  • what is your expected output here? Commented Jun 2, 2013 at 0:47
  • 2
    are you counting duplicates more than once? Commented Jun 2, 2013 at 0:49
  • 2
    possible duplicate of How to get indices of N maximum values in a numpy array? Commented Jun 2, 2013 at 1:07
  • 1
    what about [1,2,2,4,5,10,5,6,10], what would be the output in this case? Commented Jun 2, 2013 at 1:09

4 Answers 4

45

Maybe something like:

>>> K
[4, 5, 1, 6, 2, 5, 2, 10]
>>> sorted(range(len(K)), key=lambda x: K[x])
[2, 4, 6, 0, 1, 5, 3, 7]
>>> sorted(range(len(K)), key=lambda x: K[x])[-5:]
[0, 1, 5, 3, 7]

or using numpy, you can use argsort:

>>> np.argsort(K)[-5:]
array([0, 1, 5, 3, 7])

argsort is also a method:

>>> K = np.array(K)
>>> K.argsort()[-5:]
array([0, 1, 5, 3, 7])
>>> K[K.argsort()[-5:]]
array([ 4,  5,  5,  6, 10])
Sign up to request clarification or add additional context in comments.

2 Comments

Alternatively import heapq;heapq.nlargest(n, range(len(K)), key=lambda x: K[x])
This is not optimal to sort it for very big list since it may take O(n^2), while you're supposed to solved this in a linear time.
4

Consider the following code,

 N=5
 K = [1,10,2,4,5,5,6,2]
 #store list in tmp to retrieve index
 tmp=list(K)
 #sort list so that largest elements are on the far right
 K.sort()
 #To get the 5 largest elements
 print K[-N:]
 #To get the 5th largest element
 print K[-N]
 #get index of the 5th largest element
 print tmp.index(K[-N])

If you wish to ignore duplicates, then use set() as follows,

 N=5
 K = [1,10,2,4,5,5,6,2]
 #store list in tmp to retrieve index
 tmp=list(K)
 #sort list so that largest elements are on the far right
 K.sort()
 #Putting the list to a set removes duplicates
 K=set(K)
 #change K back to list since set does not support indexing
 K=list(K)
 #To get the 5 largest elements
 print K[-N:]
 #To get the 5th largest element
 print K[-N]
 #get index of the 5th largest element
 print tmp.index(K[-N])

Hopefully one of them covers your question :)

Comments

1

For efficiency, numpy partition is much more efficient for large array. There is no need to sort fully.

"All elements smaller than the k-th element are moved before this element and all equal or greater are moved behind it. The ordering of the elements in the two partitions is undefined."

1 Comment

I was about to add this as an answer. What I think @cosine means is to use np.argpartition rather than np.argsort (as in @DSM's answer). So np.argpartition(K,-5)[-5:] returns [3, 4, 5, 6, 7], the indexes of the five largest values. Using those indexes on K, np.array(K)[ np.argpartition(K,-5)[-5:] ] returns [ 4, 5, 5, 6, 10], the actual 5 largest values (which you could also get directly using np.partition(K,-5)[-5:]).
0

import headq

Then use function nlargest()

1 Comment

Dear JimmyC, You're right, this is the best way to do it in python. Yet your answer received downvotes. I suppose it's because there are a few ways in which your answer could be improved: (1) You wrote headq instead of heapq. (2) Please write full English sentences. (3) Please give an example use. (4) You could also link to the documentation: docs.python.org/3/library/heapq.html#heapq.nlargest

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.