I have a NumPy array and want to extract its unique rows, where uniqueness is determined by the first element of each row. I can get the unique first values, but not the full rows, e.g.
import numpy as np
dataA = np.array([(107., 7.475729, 6.573791, 90.0126 , 0.5529882, 0.867588 ),
(107., 7.408565, 6.38974 , 89.97312, 0.553728 , 0.8670179),
(108., 7.838725, 6.961871, 89.52572, 0.5610707, 0.7769735),
(108., 7.795123, 7.054095, 89.62989, 0.5592708, 0.7742778),
(109., 7.079929, 6.86194 , 89.6181 , 0.5660294, 0.8596874),
(109., 7.058383, 6.671512, 89.52995, 0.5663874, 0.8610857)])
print('Original Array :' , dataA)
# Get unique values from complete 2D array
uniqueValues = np.unique(dataA)
print('Unique Values : ', uniqueValues)
# Get unique rows from numpy array
uniqueRows = np.unique(dataA[:,0], axis=0)
print('Unique Rows : ', uniqueRows, sep='\n')
This gives:
Unique Rows :
[107. 108. 109.]
Desired result:
[(107., 7.475729, 6.573791, 90.0126 , 0.5529882, 0.867588 ),
(108., 7.838725, 6.961871, 89.52572, 0.5610707, 0.7769735),
(109., 7.079929, 6.86194 , 89.6181 , 0.5660294, 0.8596874)]
The above does give me the row IDs, but it seems to fail when the array contains NaNs:
dataA = np.array([(107., 7.475729, 6.573791, 90.0126 , 0.5529882, 0.867588 , np.nan, np.nan),
(107., 7.408565, 6.38974 , 89.97312, 0.553728 , 0.8670179, np.nan, np.nan),
(108., 7.838725, 6.961871, 89.52572, 0.5610707, 0.7769735, np.nan, np.nan),
(108., 7.795123, 7.054095, 89.62989, 0.5592708, 0.7742778, np.nan, np.nan),
(109., 7.079929, 6.86194 , 89.6181 , 0.5660294, 0.8596874, np.nan, np.nan),
(109., 7.058383, 6.671512, 89.52995, 0.5663874, 0.8610857, np.nan, np.nan),
(110., 7.727924, 7.116364, 90.45003, 0.5366358, 0.8887361, np.nan, np.nan),
(110., 7.748454, 7.223625, 90.6782 , 0.5349852, 0.8855141, np.nan, np.nan)])
np.unique(dataA[:,0], axis=0) doesn't give you unique rows, only the unique values in the first column, because dataA[:,0] is already a 1-D array of first elements.
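One possible way to get the full rows, as a minimal sketch: ask np.unique for the index of the first occurrence of each first-column value (return_index=True) and use those indices to select rows from the original array. The name first_idx is mine, and I'm assuming you want to keep the first row seen for each ID and that the first column itself contains no NaNs. NaNs in the other columns should not matter here, since only the first column is compared.

```python
import numpy as np

# Sample data from the question, with NaNs in the last two columns.
dataA = np.array([
    (107., 7.475729, 6.573791, 90.0126 , 0.5529882, 0.867588 , np.nan, np.nan),
    (107., 7.408565, 6.38974 , 89.97312, 0.553728 , 0.8670179, np.nan, np.nan),
    (108., 7.838725, 6.961871, 89.52572, 0.5610707, 0.7769735, np.nan, np.nan),
    (108., 7.795123, 7.054095, 89.62989, 0.5592708, 0.7742778, np.nan, np.nan),
    (109., 7.079929, 6.86194 , 89.6181 , 0.5660294, 0.8596874, np.nan, np.nan),
    (109., 7.058383, 6.671512, 89.52995, 0.5663874, 0.8610857, np.nan, np.nan),
    (110., 7.727924, 7.116364, 90.45003, 0.5366358, 0.8887361, np.nan, np.nan),
    (110., 7.748454, 7.223625, 90.6782 , 0.5349852, 0.8855141, np.nan, np.nan),
])

# Unique values of the first column, plus the index of each value's
# first occurrence in the original array.
_, first_idx = np.unique(dataA[:, 0], return_index=True)

# Select the complete rows at those indices.
uniqueRows = dataA[first_idx]

print('First-occurrence indices : ', first_idx)
print('Unique Rows : ', uniqueRows, sep='\n')
```

The selected rows come back ordered by the sorted ID values. If you need them in order of first appearance instead, indexing with np.sort(first_idx) should do it.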