I have an array db (its shape will be roughly (1e6, 300)) and a vector mask = [1, 0, 1]. The first column of db is the target; the remaining columns are the vector to test against the mask.
I want to create an out vector that contains a 1 where the corresponding row of db fits the mask and has target == 1, and a 0 everywhere else.
import numpy as np
db = np.array([       # out for mask = [1, 0, 1]
    # target, vector  #
    [1, 1, 0, 1],     # 1
    [0, 1, 1, 1],     # 0 (fits the mask but target == 0)
    [0, 0, 1, 0],     # 0
    [1, 1, 0, 1],     # 1
    [0, 1, 1, 0],     # 0
    [1, 0, 0, 0],     # 0
])
I have defined a vline function that tests one row of db at a time with np.array_equal(mask, mask & vector), so that vectors such as 101 and 111 fit the mask [1, 0, 1], and then sets out to 1 only at the indices where target == 1.
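The mask test described above can be tried in isolation. This is a minimal sketch; fits is a hypothetical helper name that does not appear in the rest of the code in this post.

```python
import numpy as np

mask = np.array([1, 0, 1])

def fits(mask, vector):
    # Hypothetical helper: True when every bit set in mask is also set in vector.
    return np.array_equal(mask, mask & vector)

print(fits(mask, np.array([1, 0, 1])))  # True
print(fits(mask, np.array([1, 1, 1])))  # True
print(fits(mask, np.array([0, 1, 1])))  # False
```

Extra ones in the vector are allowed; only a missing one at a position where the mask has a one makes the test fail.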
out is initialized to a plain list of zeros, one per row of db:
out = [0, 0, 0, 0, 0, 0]
The vline function is defined as:
def vline(idx, mask):
    line = db[idx]
    target, vector = line[0], line[1:]
    if np.array_equal(mask, mask & vector):
        if target == 1:
            out[idx] = 1
I get the correct result by applying this function line-by-line in a for loop:
def check_mask(db, out, mask=[1, 0, 1]):
    # iterate over the row indices of db (no enumerate needed)
    for idx in np.arange(db.shape[0]):
        vline(idx, mask=mask)
    return out
assert check_mask(db, out, [1, 0, 1]) == [1, 0, 0, 1, 0, 0] # it works !
Now I want to vectorize vline by creating a ufunc:
ufunc_vline = np.frompyfunc(vline, 2, 1)
out = [0, 0, 0, 0, 0, 0]
ufunc_vline(db, [1, 0, 1])
print(out)
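For reference, np.frompyfunc(func, nin, nout) returns a ufunc that first broadcasts its inputs against each other and then calls func once per element, returning an object array. A minimal sketch with a hypothetical two-argument function add (not part of the code above):

```python
import numpy as np

def add(a, b):
    # Called with one scalar from each broadcast input at a time.
    return a + b

ufunc_add = np.frompyfunc(add, 2, 1)

a = np.array([[1, 2], [3, 4]])  # shape (2, 2)
b = [10, 20]                    # shape (2,), broadcasts along the last axis
result = ufunc_add(a, b)

print(result.dtype)     # object
print(result.tolist())  # [[11, 22], [13, 24]]
```

Each call therefore receives individual elements, never a whole row or a row index.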
But the ufunc complains about broadcasting inputs with those shapes:
In [217]: ufunc_vline(db, [1, 0, 1])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-217-9008ebeb6aa1> in <module>()
----> 1 ufunc_vline(db, [1, 0, 1])
ValueError: operands could not be broadcast together with shapes (6,4) (3,)
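The error is raised before vline is ever called: a ufunc broadcasts its inputs against each other, and shapes (6, 4) and (3,) are incompatible because the trailing dimensions 4 and 3 differ. Even with compatible shapes, vline would receive single elements of db rather than row indices. One possible way around this, sketched below under the assumption that db contains only 0/1 integers, is to skip np.frompyfunc and express the test as whole-array operations. The name check_mask_vectorized is hypothetical.

```python
import numpy as np

db = np.array([
    [1, 1, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
])

def check_mask_vectorized(db, mask):
    # Hypothetical replacement for the check_mask loop.
    mask = np.asarray(mask)
    target, vectors = db[:, 0], db[:, 1:]
    # Row-wise equivalent of np.array_equal(mask, mask & vector):
    # mask with shape (3,) broadcasts against vectors with shape (n, 3).
    fits = ((vectors & mask) == mask).all(axis=1)
    # Keep only the rows that fit the mask and have target == 1.
    return (fits & (target == 1)).astype(int)

out = check_mask_vectorized(db, [1, 0, 1])
print(out.tolist())  # [1, 0, 0, 1, 0, 0]
```

For a db of roughly (1e6, 300), the intermediate arrays have the same shape as db[:, 1:], so memory may be a consideration; storing db with a boolean or uint8 dtype would reduce it.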