
I have an MxN dataframe that contains the solution to an optimization problem. "Active" (i, j) pairs, for i in {M} and j in {N}, are represented by 1 and "inactive" pairs by 0. I need to find the (i, j) values of all active cells, ideally without a for loop over the index or columns.

This would be an example:

In [73]: sol_df
Out[73]:

    1    2    3   4   5
1   0    0    1   0   0
2   1    0    0   0   0
3   0    1    0   0   0
4   0    0    0   0   0 

In this case, what I would require is a list of pairs (tuples would do):

[(1,3), (2,1), (3,2)]

Is there a way?

Thanks!

A.

EDIT: explanation was unclear
EDIT2: my explanation was still unclear

6
  • Have you tried your_dataframe == 1? Or do you want the results as a list of tuples? Commented Jul 20, 2015 at 15:57
  • I think this is a duplicate of this question. Try df[df == 1].index.tolist(). Commented Jul 20, 2015 at 15:59
  • @Mauris, seems like that would work, I'll try it. tobias_k, seems like Mauris and you are on the same page. I'll get back to you. Commented Jul 20, 2015 at 15:59
  • @Mauris, set(df[df == 1].index.tolist()) == set(df.index.tolist()) yields True. I'll try your first suggestion, which I think could be what I was looking for. Commented Jul 20, 2015 at 16:03
  • @misterte do you want the row and columns as tuple? Commented Jul 20, 2015 at 16:04

1 Answer

5
>>> import numpy
>>> a = numpy.array([[1, 0, 1], [0, 1, 1], [0, 1, 0]])
>>> numpy.transpose(numpy.nonzero(a))
array([[0, 0],
       [0, 2],
       [1, 1],
       [1, 2],
       [2, 1]])
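
The output above holds zero-based positions, while the question asks for the dataframe's own row and column labels. A minimal sketch of that last step, assuming a recent pandas where `DataFrame.to_numpy()` is available; the frame below is rebuilt from the example in the question, and `rows`, `cols` and `pairs` are illustrative names:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (labels start at 1).
sol_df = pd.DataFrame(
    [[0, 0, 1, 0, 0],
     [1, 0, 0, 0, 0],
     [0, 1, 0, 0, 0],
     [0, 0, 0, 0, 0]],
    index=[1, 2, 3, 4],
    columns=[1, 2, 3, 4, 5],
)

# nonzero() returns one array of row positions and one of column positions.
rows, cols = np.nonzero(sol_df.to_numpy())

# Translate the positions back into the frame's labels and pair them up.
pairs = list(zip(sol_df.index[rows].tolist(), sol_df.columns[cols].tolist()))
print(pairs)  # [(1, 3), (2, 1), (3, 2)]
```

Going through the labels rather than adding 1 to each position keeps this correct when the index or columns are not consecutive integers.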

2 Comments

Exactly this. Thanks!
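For anybody working with a dataframe, use df.as_matrix() first (as_matrix() was removed in pandas 1.0; on current versions use df.to_numpy() or df.values instead).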
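
If converting to an array is not wanted, a pandas-only route may also work. This is a different technique from the answer's `numpy.nonzero`: a sketch using `DataFrame.stack()`, which turns the frame into a Series keyed by (row label, column label). The frame is again the question's example, and `stacked` and `pairs` are illustrative names:

```python
import pandas as pd

# Same example frame as in the question.
sol_df = pd.DataFrame(
    [[0, 0, 1, 0, 0],
     [1, 0, 0, 0, 0],
     [0, 1, 0, 0, 0],
     [0, 0, 0, 0, 0]],
    index=[1, 2, 3, 4],
    columns=[1, 2, 3, 4, 5],
)

# stack() yields a Series whose MultiIndex entries are (row, column) labels.
stacked = sol_df.stack()

# Keep only the active cells; the remaining index entries are the pairs.
pairs = stacked[stacked == 1].index.tolist()
print(pairs)
```

For large frames the `numpy.nonzero` approach is likely to be faster, since `stack()` materializes every cell before filtering.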
