
For example, I get several sub-arrays by splitting one array A according to the sizes in list B:

import numpy as np

A = np.array([[1,1,1],
              [2,2,2],
              [2,3,4],
              [5,8,10],
              [5,9,9],
              [7,9,6],
              [1,1,1],
              [2,2,2],
              [9,2,4],
              [9,3,6],
              [10,3,3],
              [11,2,2]])
B = np.array([5,7])
C = np.split(A,B.cumsum()[:-1])
>>> print(C)
[array([[1,1,1],
        [1,2,2],
        [2,3,4],
        [5,8,10],
        [5,9,9]]),
 array([[7,9,6],
        [1,1,1],
        [2,2,2],
        [9,2,4],
        [9,3,6],
        [10,3,3],
        [11,2,2]])]

How can I keep only the rows that appear exactly once across all the sub-arrays (that is, delete every row that appears twice)? Since [1,1,1] and [2,2,2] each appear twice in C, the result should look like this:

[array([[2,3,4],
        [5,8,10],
        [5,9,9]]),
 array([[7,9,6],
        [9,2,4],
        [9,3,6],
        [10,3,3],
        [11,2,2]])]
  • Is your 1,2,2 actually a 2,2,2? Commented May 10, 2022 at 14:56
  • Oh yes, sorry, it's 2,2,2. But np.unique will still include 2,2,2 and 1,1,1, right? It only deletes one copy of each, not both. Commented May 10, 2022 at 15:00

1 Answer

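You can use np.unique with return_index and return_counts to find the rows that occur exactly once, turn that into a boolean mask over the rows of A, and split the mask the same way as A: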

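# i: index of the first occurrence of each distinct row; c: how often that row occurs
_, i, c = np.unique(A, axis=0, return_index=True, return_counts=True)

# boolean mask over the rows of A: True where the row occurs exactly once
idx = np.isin(np.arange(len(A)), i[c==1])

# split A and the mask at the same boundaries, then filter each sub-array by its mask
out = [a[m] for a, m in zip(np.split(A, B.cumsum()[:-1]),
                            np.split(idx, B.cumsum()[:-1]))]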

output:

[array([[ 2,  3,  4],
        [ 5,  8, 10],
        [ 5,  9,  9]]),
 array([[ 7,  9,  6],
        [ 9,  2,  4],
        [ 9,  3,  6],
        [10,  3,  3],
        [11,  2,  2]])]
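
For convenience, the same approach can be wrapped in a small helper. This is only a sketch: the function name `split_keep_unique_rows` and its parameter names are made up for illustration, and it assumes that `sizes` holds the chunk lengths (as B does in the question) and that they sum to `len(A)`.

```python
import numpy as np

def split_keep_unique_rows(A, sizes):
    """Split A into chunks of the given sizes and keep only the rows
    that occur exactly once in the whole of A (hypothetical helper)."""
    A = np.asarray(A)
    # first_idx: position of each distinct row's first occurrence
    # counts:    number of times each distinct row occurs in A
    _, first_idx, counts = np.unique(
        A, axis=0, return_index=True, return_counts=True
    )
    # A row that occurs once has its only occurrence at first_idx
    mask = np.zeros(len(A), dtype=bool)
    mask[first_idx[counts == 1]] = True
    # Split data and mask at the same boundaries, then filter each chunk
    bounds = np.cumsum(sizes)[:-1]
    return [chunk[m] for chunk, m in zip(np.split(A, bounds),
                                         np.split(mask, bounds))]

A = np.array([[1, 1, 1], [2, 2, 2], [2, 3, 4], [5, 8, 10],
              [5, 9, 9], [7, 9, 6], [1, 1, 1], [2, 2, 2],
              [9, 2, 4], [9, 3, 6], [10, 3, 3], [11, 2, 2]])
B = np.array([5, 7])

out = split_keep_unique_rows(A, B)
```

Because the counts are computed on the whole of A before splitting, a row is dropped even when its two copies end up in different sub-arrays, which is what the question asks for.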