I have a NumPy array that is created as follows:

data = np.zeros(500, dtype='float32, (50000,2)float32')

This array is filled with values that I acquire from some measurements, and is supposed to reflect that at each time point (there is room for 500 time points) we can acquire 50,000 x- and y-coords.

Later in my code I use a bisect-like search, for which I need to know how many X-coords (measurement points) are actually in my array. I originally did this with np.count_nonzero(data), which yielded the following problem:

Fake data:

1 1
2 2
3 0
4 4
5 0
6 6
7 7
8 8
9 9
10 10

The non-zero count returns 18 here. The code then goes into the bisect-like search using data[time][1][0][0] as the min X-coord and data[time][1][np.count_nonzero(data)][0] as the max X-coord, which results in the search stopping at 9 instead of 10.
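To make the miscount concrete, here is a minimal sketch that rebuilds the fake data as a plain (10, 2) float32 array. The name points is made up for illustration and is not part of my real code. It compares counting over the whole array with counting over the X column only.

```python
import numpy as np

# The ten fake (x, y) points listed above; two of the Y values are zero.
points = np.array(
    [[1, 1], [2, 2], [3, 0], [4, 4], [5, 0],
     [6, 6], [7, 7], [8, 8], [9, 9], [10, 10]],
    dtype=np.float32,
)

# Counting over the whole array mixes X and Y entries together.
total_nonzero = np.count_nonzero(points)

# Counting over column 0 only gives the number of X-coords.
x_nonzero = np.count_nonzero(points[:, 0])

print(total_nonzero)  # 18
print(x_nonzero)      # 10
```

The 18 comes from 20 entries minus the two zero Y values, so it is not the number of measurement points.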

I could use a while loop to manually count the non-zero values in the X-coord column, but that would be silly; I assume there is some builtin NumPy functionality for this. My question is: what builtin functionality, or what modification of my np.count_nonzero(data) call, do I need? The documentation doesn't offer much information in that regard (link to numpy doc).

-- Simplified question --

Can I use NumPy functionality to count the non-zero values for a single column only? (i.e. between data[time][1][0][0] and data[time][1][max][0])

3 Answers

2

Maybe a better approach would be to filter the array using nonzero and iterate over the result:

nonZeroData = data[time][1][np.nonzero(data[time][1])]

To count non-zero values only in the second column:

nonZeroYCount = np.count_nonzero(data[time][1][:, 1])
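The same slicing works for the X column by using index 0 instead of 1. Below is a small sketch on a structured array with the question's layout. The sizes n_times and n_points are shrunk and the sample values are invented, purely to keep the example light.

```python
import numpy as np

# Same dtype layout as in the question, but with smaller sizes (assumption).
n_times, n_points = 5, 10
data = np.zeros(n_times, dtype=f'float32, ({n_points},2)float32')

# Fill four made-up points at one time step; two of them have Y == 0.
time = 2
data["f1"][time, :4] = [[1, 1], [2, 0], [3, 3], [4, 0]]

# Count per column: column 0 holds X, column 1 holds Y.
x_count = np.count_nonzero(data[time][1][:, 0])
y_count = np.count_nonzero(data[time][1][:, 1])

# The last filled X-coord sits at index x_count - 1.
last_x = data[time][1][x_count - 1][0]

print(x_count, y_count, last_x)  # 4 2 4.0
```

Note that this still treats a genuine X value of 0 as "empty", so it only fits data where X is never exactly zero.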

4 Comments

Wouldn't that yield exactly the same problem and skip all Y-coords which are 0?
Not sure if I understand your problem. nonzero's purpose is to skip 0 elements - isn't this what you wanted? If your condition is more elaborate you can use where.
I need to retain all datapoints where the X-coord has a normal value but the Y-coord is 0. The approach that I used (and I think this applies to the one you suggest as well) would then skip the empty Y-coord, while I need the number of non-zero X-coords. (so those in data[time][1][0:unknown][0])
The second part of your answer c(sh)ould be what I was looking for indeed, going to try it for a bit.
1

If I understand you correctly, to select elements from data[time][1][0][0] to data[time][1][max][0]:

data[time][1][:max+1,0]

EDIT:

To count all non-zero X-coords for every time point:

(data["f1"][:,:,0] != 0).sum(1)
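As a rough illustration of what that one-liner returns, here is a sketch on a tiny array with the same dtype layout. The sizes and fill values are made up. It relies on "f1" being the default name NumPy gives the second field, and it also shows np.count_nonzero with its axis argument as an equivalent spelling.

```python
import numpy as np

# Same dtype layout as in the question, with tiny sizes (assumption).
n_times, n_points = 3, 6
data = np.zeros(n_times, dtype=f'float32, ({n_points},2)float32')

# Fill a different number of X-coords at each time step; time 2 stays empty.
data["f1"][0, :2, 0] = [1, 2]
data["f1"][1, :4, 0] = [1, 2, 3, 4]

# Boolean mask of non-zero X values, summed along the points axis.
counts = (data["f1"][:, :, 0] != 0).sum(axis=1)

# Equivalent spelling using count_nonzero with an axis.
counts_alt = np.count_nonzero(data["f1"][:, :, 0], axis=1)

print(counts)      # [2 4 0]
print(counts_alt)  # [2 4 0]
```

Each entry of counts is the number of non-zero X-coords at that time index, so counts[time] can replace a per-step call.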

3 Comments

If I get OP right, no need to max+1, np.count_nonzero(data[time][1][:,0]) is what he wants
@alko You are correct, I ended up using BartoszKP's answer with that modification.
@BasJansen, I added a method that can count all non-zero values for every time point without a for loop. Does this work for you?
0

Why not consider using data != 0 to get the bool matrix?

You can use:

stat = sum(data != 0) to count the non-zero entries.

I am not sure what shape your data array has but hope you can see what I mean. :)

1 Comment

The question was how not to do what you are suggesting, since this would lead to exactly the same problem. I also wished to remain within the numpy library if possible, since it has to count a numpy array.
