The documentation for the CSR format says:
Disadvantages of the CSR format: slow column slicing operations (consider CSC)
And the documentation for the sparse array module says:
All conversions among the CSR, CSC, and COO formats are efficient, linear-time operations.
So why not convert CSR to CSC and then carry out your column filtering operation?
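As a minimal sketch of that conversion (the small matrix below is made-up example data, not the dataset from the post):

```python
import numpy as np
import scipy.sparse

# Hypothetical example data standing in for the dataset in the post.
dense = np.array([[1.0, 0.0, 2.0],
                  [0.0, 0.0, 3.0],
                  [4.0, 5.0, 0.0]])
m_csr = scipy.sparse.csr_matrix(dense)

# Convert once up front, then do all the column-oriented work on the CSC copy.
m_csc = m_csr.tocsc()
```

The conversion is paid once, so it is worth it whenever the column-oriented work that follows touches many columns.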
Update: to take advantage of CSC format, obviously you'd have to rewrite your column-filtering operation. The idea is to operate on the CSC representation of the sparse matrix (and not just on its abstract representation as an array), which is documented as follows:
the row indices for column i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]].
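To make that layout concrete, here is a small sketch that reads one column straight out of the three arrays. The matrix and the helper name `column_entries` are made up for illustration.

```python
import numpy as np
import scipy.sparse

# Hypothetical example data.
dense = np.array([[1.0, 0.0, 2.0],
                  [0.0, 0.0, 3.0],
                  [4.0, 5.0, 0.0]])
m = scipy.sparse.csc_matrix(dense)

def column_entries(m, i):
    """Return (row indices, values) of the stored entries in column i."""
    start, end = m.indptr[i], m.indptr[i + 1]
    return m.indices[start:end], m.data[start:end]

# Column 2 has stored entries in rows 0 and 1, with values 2.0 and 3.0.
rows, values = column_entries(m, 2)
```

Only the stored (nonzero) entries appear in these slices; the zero in row 2 of column 2 is implicit and takes up no space.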
So instead of filtering each column separately (as you do in the code in the post), you should filter the whole data array in one go, like this:
import scipy.sparse

# m is your dataset in CSC format -- flag the stored values below the threshold
filtered_data = m.data < v["threshold"]
# construct a new sparse array with the same sparsity pattern as m, holding the flags
f = scipy.sparse.csc_matrix((filtered_data, m.indices, m.indptr), shape=m.shape)
# dense 1-D mask of the columns in which no stored value is below the threshold
cols = f.max(axis=0).toarray().ravel() == 0
(Maybe this wasn't obvious. But digging into the representation is often the way that you have to work with sparse compressed matrices.)
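For experimenting with this approach, here is a self-contained sketch on made-up data. The `threshold` value stands in for `v["threshold"]` from the post, and the final column selection is an assumption about what you want to do with the mask.

```python
import numpy as np
import scipy.sparse

threshold = 3.0  # hypothetical stand-in for v["threshold"]

# Hypothetical example data, converted from CSR to CSC as suggested above.
dense = np.array([[1.0, 0.0, 5.0],
                  [0.0, 0.0, 4.0],
                  [2.0, 6.0, 0.0]])
m = scipy.sparse.csr_matrix(dense).tocsc()

# 1 where a stored value is below the threshold, 0 otherwise.
# (Cast to a small integer type so that max() works on ordinary numbers.)
below = (m.data < threshold).astype(np.int8)

# Same sparsity pattern as m, but holding the flags instead of the values.
f = scipy.sparse.csc_matrix((below, m.indices, m.indptr), shape=m.shape)

# A column's maximum flag is 0 exactly when none of its stored values
# is below the threshold.
cols = f.max(axis=0).toarray().ravel() == 0

# Assumed follow-up step: keep only the columns selected by the mask.
kept = m[:, np.flatnonzero(cols)]
```

Note that only stored entries are compared against the threshold. Implicit zeros never appear in `m.data`, so a column is not flagged just because it contains zeros that are below the threshold.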