11

I'm trying to find a way to select rows in a pandas DataFrame based on whether a column's values are in my list. For example:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(6).reshape(3,2), columns=['A','B'])
   A  B
0  0  1
1  2  3
2  4  5

I know that I can select a certain row, e.g.

df[df.A==0]

will select the row with A=0. What I want is to select multiple rows whose values are in my list, e.g. A in [0,2]. I tried

df[df.A in [0,2]]
df[list(df.A)==[0,2]]

but neither works. In R I can use the %in% operator, and in plain Python syntax we can write A in [0,2], etc. How can I select a subset of rows in pandas in this case? Thanks, Valentin.

0

2 Answers

30

Series.isin() will select rows matching multiple values:

>>> df[df.A.isin([0,2])]
   A  B
0  0  1
1  2  3
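
As a minimal sketch of the same idea, the boolean mask returned by isin() can be stored and reused, inverted with the ~ operator for "not in", or passed to df.loc. The names wanted, mask, selected, excluded, and selected_loc are only illustrative and do not come from the question:

```python
import numpy as np
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=['A', 'B'])

wanted = [0, 2]  # illustrative name for the list of values to match

# Boolean Series: True where column A holds one of the wanted values.
mask = df['A'].isin(wanted)

selected = df[mask]    # rows where A is in the list
excluded = df[~mask]   # rows where A is NOT in the list

# A boolean mask also works with .loc, optionally with a column selection.
selected_loc = df.loc[mask, ['B']]
```

The ~ form should be equivalent to np.logical_not on the mask here; which one to use is mostly a matter of taste.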

8 Comments

Brian, thanks, that works. How about the negation, i.e., not in?
You can use numpy's logical_not: df[np.logical_not(df.A.isin([0,2]))]
Great, that completely answers my question.
Is there a way to use pandas df.loc, iloc or ix to do this?
I don't think so, unless you 'cheat' by knowing in advance which rows you are looking for. In this example, df.iloc[0:2] (the 1st and 2nd rows) and df.loc[0:1] (rows with index values in the range 0-1, the index being the unlabeled column on the left) both give the equivalent output, but you had to know that in advance. If you want a different syntax, there is a df.query() method.
3

If you don't like that syntax, you can also use query (introduced in pandas 0.13, released in 2014):

>>> df.query('A in [0,2]')
   A  B
0  0  1
1  2  3
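
When the list lives in a Python variable rather than in the query string, query() can refer to it with the @ prefix, and "not in" covers the negation asked about in the comments. This is a small sketch; the names wanted, in_list, and not_in_list are illustrative assumptions, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=['A', 'B'])

wanted = [0, 2]  # illustrative name for the list of values to match

# '@wanted' tells query() to look up the local Python variable.
in_list = df.query('A in @wanted')

# Negated form: keep rows whose A value is not in the list.
not_in_list = df.query('A not in @wanted')
```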
