I have a DataFrame like the one below.
In [23]: data2 = [{'a': 'x', 'b': 'y','c':'q'}, {'a': 'x', 'b': 'p', 'c': 'q'}, {'a':'p', 'b':'q'},{'a':'q', 'b':'y','c':'q'}]
In [26]: df = pd.DataFrame(data2)
In [27]: df
Out[27]:
   a  b    c
0  x  y    q
1  x  p    q
2  p  q  NaN
3  q  y    q
I want to use boolean indexing to keep only the rows in which column a or column b contains either x or y. I am currently doing this as follows:
In [29]: df[df['a'].isin(['x','y']) | (df['b'].isin(['x','y']))]
Out[29]:
   a  b  c
0  x  y  q
1  x  p  q
3  q  y  q
But I have over 50 columns to check, and writing out a condition for each column does not seem very Pythonic. I tried:
In [30]: df[df[['a','b']].isin(['x','y'])]
But the output is not what I expected. I get the following:
Out[30]:
     a    b    c
0    x    y  NaN
1    x  NaN  NaN
2  NaN  NaN  NaN
3  NaN    y  NaN
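As far as I understand pandas, this output is what you get when a DataFrame is indexed with a boolean DataFrame: individual cells are masked rather than rows being selected, much like `DataFrame.where`. Cells where the mask is False become NaN, and column c, which is not part of the mask at all, seems to be treated as all False. A minimal sketch reproducing this with the data above (the names `mask`, `masked`, and `same_as_where` are only for illustration):

```python
import pandas as pd

data2 = [{'a': 'x', 'b': 'y', 'c': 'q'}, {'a': 'x', 'b': 'p', 'c': 'q'},
         {'a': 'p', 'b': 'q'}, {'a': 'q', 'b': 'y', 'c': 'q'}]
df = pd.DataFrame(data2)

# Boolean DataFrame with the same index as df, but only columns a and b.
mask = df[['a', 'b']].isin(['x', 'y'])

# Indexing with a boolean DataFrame masks cell by cell instead of picking rows.
masked = df[mask]

# Compare against DataFrame.where with the same mask (NaNs in the same
# positions count as equal for DataFrame.equals).
same_as_where = masked.equals(df.where(mask))

print(masked)
print(same_as_where)
```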
I could drop the rows that are all NaN, but values would still be missing in the remaining rows.
For example, in row 0, column c is NaN, but I need that value.
Any suggestions on how to do this?
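One possible approach, sketched under the assumption that the goal is to keep whole rows in which any of the checked columns contains x or y: reduce the boolean DataFrame to a single boolean per row with `any(axis=1)` and index with the resulting Series. The names `cols`, `row_mask`, and `result` are illustrative; with 50 columns, `cols` would be the list of column names to check.

```python
import pandas as pd

data2 = [{'a': 'x', 'b': 'y', 'c': 'q'}, {'a': 'x', 'b': 'p', 'c': 'q'},
         {'a': 'p', 'b': 'q'}, {'a': 'q', 'b': 'y', 'c': 'q'}]
df = pd.DataFrame(data2)

# Columns to check (hypothetical list; extend it to all relevant columns).
cols = ['a', 'b']

# isin gives one boolean per cell; any(axis=1) collapses that to one
# boolean per row: True if at least one checked column holds x or y.
row_mask = df[cols].isin(['x', 'y']).any(axis=1)

# Indexing with a boolean Series selects whole rows, so column c is kept.
result = df[row_mask]
print(result)
```

Because `row_mask` is a Series rather than a DataFrame, the selection is row-wise and the values in the unchecked columns are left untouched.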