2

I have a list made up of arrays. All have shape (2,).

Minimal example: mylist = [np.array([1,2]),np.array([1,2]),np.array([3,4])]

I would like to get a unique list, e.g. [np.array([1,2]),np.array([3,4])]

or perhaps even better, a dict with counts, e.g. {np.array([1,2]) : 2, np.array([3,4]) : 1}

So far I tried list(set(mylist)), but the error is TypeError: unhashable type: 'numpy.ndarray'

4 Answers

4

As the error indicates, NumPy arrays aren't hashable. You can convert them to tuples, which are hashable, and build a collections.Counter from the result:

from collections import Counter

Counter(map(tuple,mylist))
# Counter({(1, 2): 2, (3, 4): 1})

If you only want the unique tuples, you can construct a set:

set(map(tuple,mylist))
# {(1, 2), (3, 4)}
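
If you need the result as a list of arrays again, as the question asks, one possible sketch is below. It swaps the set for dict.fromkeys, which is my substitution and not part of the answer above, so that the first-seen order is preserved; the names unique_tuples and unique_arrays are only illustrative.

```python
import numpy as np

mylist = [np.array([1, 2]), np.array([1, 2]), np.array([3, 4])]

# dict.fromkeys drops duplicate keys and keeps first-seen order
# (guaranteed for dicts since Python 3.7); a set gives no order guarantee.
unique_tuples = list(dict.fromkeys(map(tuple, mylist)))

# Convert each unique tuple back into a NumPy array.
unique_arrays = [np.array(t) for t in unique_tuples]

print(unique_arrays)
```

This stays in plain Python, so it also works when the inner arrays differ in length.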

3 Comments

This answer is also good, but I accepted another one as that's what I decided to use in the end.
If you have a list of equally shaped arrays, construct an array from it with np.array(mylist) so that you can leverage NumPy's functions and get much better performance. Since you posted a list in the question, I assumed the inner arrays could differ in shape. @cddt
Thank you for the comprehensive answer. You are correct, I should have converted the list to an array.
2

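In general, the best option is to call the np.unique function with axis=0 and the return_index and return_counts flags, where X is the 2D array built from your list, e.g. np.array(mylist):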

u, idx, counts = np.unique(X, axis=0, return_index=True, return_counts=True)

Then, according to documentation:

  • u is a 2D array whose rows are the unique arrays, in sorted order
  • idx holds the indices of the first occurrence of each unique row in X
  • counts is the number of times each unique row appears in X

If you need a dictionary, you can't use unhashable values such as arrays as its keys, so you might store them as tuples, as in @yatu's answer, or like this:

dict(zip([tuple(n) for n in u], counts))
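
Put together with the question's example data, the steps above might look like the following sketch. X is assumed to be np.array(mylist), and counts_by_row is an illustrative name; the .tolist() and int() conversions are my addition so that the keys and values are plain Python integers.

```python
import numpy as np

mylist = [np.array([1, 2]), np.array([1, 2]), np.array([3, 4])]

# Assumption: all inner arrays have the same shape, so this is a (3, 2) array.
X = np.array(mylist)

# axis=0 treats each row as one item to compare.
u, idx, counts = np.unique(X, axis=0, return_index=True, return_counts=True)

# Build a plain dict keyed by tuples of Python ints.
counts_by_row = {tuple(row.tolist()): int(c) for row, c in zip(u, counts)}

print(counts_by_row)  # {(1, 2): 2, (3, 4): 1}
print(idx.tolist())   # [0, 2]
```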


1

Pure numpy approach:

numpy.unique(mylist, axis=0)

which produces a 2d array with your unique arrays in rows:

array([[1, 2],
       [3, 4]])

This works if all your arrays have the same length (as in your example). It can be useful depending on what you do earlier in your code: you might not need to drop into plain Python at all and can stay in NumPy instead, which should be faster.


0

Use the following:

import numpy as np
mylist = [np.array([1,2]),np.array([1,2]),np.array([3,4])]
np.unique(mylist, axis=0)

This gives a 2D array of the unique arrays:

array([[1, 2],
       [3, 4]])

Source: https://numpy.org/devdocs/user/absolute_beginners.html#how-to-get-unique-items-and-counts

1 Comment

You might want to mention that this internally constructs a 2D array from the list, which will fail if one of the inner arrays has a different length.
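
For lists where the inner arrays may differ in length, a tuple-based count avoids building a 2D array at all. The sketch below uses a hypothetical ragged list (not from the question) and the Counter approach from the first answer:

```python
from collections import Counter

import numpy as np

# Hypothetical input: the last array is longer than the others,
# so the list cannot be stacked into a rectangular 2D array.
ragged = [np.array([1, 2]), np.array([1, 2]), np.array([3, 4, 5])]

# Each array becomes a tuple of plain Python ints, which is hashable
# regardless of its length.
counts = Counter(tuple(a.tolist()) for a in ragged)

print(counts)
```

The trade-off is speed: this loops in Python, so for large, equally shaped inputs np.unique on a real 2D array is likely the better choice.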
