Querying MultiIndex DataFrame in Pandas

Question

I have a DataFrame that looks like this:

FirstDF=
              C
A    B      
'a' 'blue'   43
    'green'  59
'b' 'red'    56
'c' 'green'  80
    'orange' 72

Where A and B are set as indexes. I also have a DataFrame that looks like:

SecondDF=

    A     B
0  'a'  'green'
1  'b'  'red'
2  'c'  'green'

Is there a way to query the first DataFrame directly with the second one and obtain the C values for the matching (A, B) pairs (here 59, 56 and 80)?

I did it by iterating over the second DataFrame, as shown below, but I would like to do it using pandas logic instead of for loops.

data=[]
for i in range(SecondDF.shape[0]):
    data.append(FirstDF.loc[tuple(SecondDF.iloc[i])])
data=pd.Series(data)

jezrael · Accepted Answer · 2018-10-22 05:48:10Z

Use merge with the parameters left_index and right_on:

df = FirstDF.merge(SecondDF, left_index=True, right_on=['A','B'])['C'].to_frame()
print (df)
    C
0  59
1  56
2  80
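For a reproducible run, here is a minimal sketch of the merge approach. The way FirstDF and SecondDF are built is an assumption based on the tables in the question (plain strings without literal quote characters); the variable name matched is only illustrative.

```python
import pandas as pd

# Assumed reconstruction of the question's data (plain strings, no literal quotes).
FirstDF = pd.DataFrame(
    {
        "A": ["a", "a", "b", "c", "c"],
        "B": ["blue", "green", "red", "green", "orange"],
        "C": [43, 59, 56, 80, 72],
    }
).set_index(["A", "B"])

SecondDF = pd.DataFrame({"A": ["a", "b", "c"], "B": ["green", "red", "green"]})

# Inner join: FirstDF's index levels (A, B) against SecondDF's columns A and B.
matched = FirstDF.merge(SecondDF, left_index=True, right_on=["A", "B"])[["C"]]
print(matched)
```

Selecting with [["C"]] returns a one-column DataFrame directly, which is equivalent to ['C'].to_frame() above.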

Another solution with isin of MultiIndexes and filtering by boolean indexing:

mask = FirstDF.index.isin(SecondDF.set_index(['A','B']).index)
#alternative solution
#mask = FirstDF.index.isin(list(map(tuple,SecondDF[['A','B']].values.tolist())))
df = FirstDF.loc[mask, ['C']].reset_index(drop=True)
print (df)
    C
0  59
1  56
2  80

Detail:

print (FirstDF.loc[mask, ['C']])
              C
A   B          
'a' 'green'  59
'b' 'red'    56
'c' 'green'  80
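One practical difference between the two approaches shows up when the lookup frame repeats an (A, B) pair. The sketch below assumes a hypothetical SecondDup frame (not part of the question) that lists ('a', 'green') twice: merge returns one row per matching pair in the lookup frame, while the isin mask returns each FirstDF row at most once.

```python
import pandas as pd

# Assumed reconstruction of the question's FirstDF.
FirstDF = pd.DataFrame(
    {
        "A": ["a", "a", "b", "c", "c"],
        "B": ["blue", "green", "red", "green", "orange"],
        "C": [43, 59, 56, 80, 72],
    }
).set_index(["A", "B"])

# Hypothetical lookup frame with ('a', 'green') listed twice.
SecondDup = pd.DataFrame({"A": ["a", "a", "b"], "B": ["green", "green", "red"]})

# merge: the duplicated pair matches twice, so the FirstDF row is repeated.
merged = FirstDF.merge(SecondDup, left_index=True, right_on=["A", "B"])

# isin: a boolean mask over FirstDF, so each row appears at most once.
mask = FirstDF.index.isin(SecondDup.set_index(["A", "B"]).index)
filtered = FirstDF.loc[mask]

print(len(merged))    # rows from the merge
print(len(filtered))  # rows from the mask
```

Which behaviour is preferable depends on whether repeated pairs in the lookup frame are meaningful for the task.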

EDIT:

To get the opposite (the rows of FirstDF whose (A, B) pair is not in SecondDF), you can use merge with an outer join and the indicator=True parameter, then filter by boolean indexing:

df1=FirstDF.merge(SecondDF, left_index=True, right_on=['A','B'], indicator=True, how='outer')
print (df1)
    C    A         B     _merge
2  43  'a'    'blue'  left_only
0  59  'a'   'green'       both
1  56  'b'     'red'       both
2  80  'c'   'green'       both
2  72  'c'  'orange'  left_only

mask = df1['_merge'] != 'both'
df1 = df1.loc[mask, ['C']].reset_index(drop=True)
print (df1)
    C
0  43
1  72

For the second solution, invert the boolean mask with ~:

mask = FirstDF.index.isin(SecondDF.set_index(['A','B']).index)
#alternative solution
#mask = FirstDF.index.isin(list(map(tuple,SecondDF[['A','B']].values.tolist())))
df = FirstDF.loc[~mask, ['C']].reset_index(drop=True)
print (df)
    C
0  43
1  72
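A small caveat about the indicator filter, sketched under the assumption that the lookup frame may contain pairs missing from FirstDF (the question's data does not). With an outer join, such pairs show up as right_only rows with NaN in C, so filtering on != 'both' keeps them, while == 'left_only' keeps only FirstDF rows. SecondExtra is a hypothetical frame made up for this illustration.

```python
import pandas as pd

# Assumed reconstruction of the question's FirstDF.
FirstDF = pd.DataFrame(
    {
        "A": ["a", "a", "b", "c", "c"],
        "B": ["blue", "green", "red", "green", "orange"],
        "C": [43, 59, 56, 80, 72],
    }
).set_index(["A", "B"])

# Hypothetical lookup frame with one pair, ('z', 'pink'), absent from FirstDF.
SecondExtra = pd.DataFrame(
    {"A": ["a", "b", "c", "z"], "B": ["green", "red", "green", "pink"]}
)

merged = FirstDF.merge(
    SecondExtra, left_index=True, right_on=["A", "B"], how="outer", indicator=True
)

# Includes the right_only row for ('z', 'pink'), whose C is NaN.
not_both = merged.loc[merged["_merge"] != "both", ["C"]]

# Only rows that exist in FirstDF and have no match in SecondExtra.
left_only = merged.loc[merged["_merge"] == "left_only", ["C"]]

print(not_both)
print(left_only)
```

The inverted isin mask is not affected by this, because it only ever selects rows of FirstDF.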

Is there a way of doing the opposite? Like, with the same dataframes, getting: C 43 72.

jimmy · Accepted Answer · 2018-10-18 10:14:27Z

FirstDF.loc[list(zip(SecondDF['A'], SecondDF['B']))]

Explanation:

The idea is to take the index values from the second DataFrame and use them on the first one. For a MultiIndex, you can pass a tuple of index values to get the matching row.

FirstDF.loc[('bar','two'),]

will give you all the rows whose first index level is 'bar' and whose second index level is 'two'.

FirstDF.loc[(SecondDF['A'],SecondDF['B']),]

takes those index values directly from SecondDF, which is what you want, but the catch is that it selects every combination of the 'A' and 'B' values. Adding zip restricts the selection to pairs that come from the same row of SecondDF.
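The difference between row-wise pairs and per-level combinations can be sketched as follows. With the question's SecondDF both selections happen to return the same three rows, so this example assumes a hypothetical Pairs frame chosen to make them differ; the zip result is wrapped in list so that .loc receives a concrete list of tuples.

```python
import pandas as pd

# Assumed reconstruction of the question's FirstDF.
FirstDF = pd.DataFrame(
    {
        "A": ["a", "a", "b", "c", "c"],
        "B": ["blue", "green", "red", "green", "orange"],
        "C": [43, 59, 56, 80, 72],
    }
).set_index(["A", "B"])

# Hypothetical lookup frame chosen so the two selections differ.
Pairs = pd.DataFrame({"A": ["a", "c"], "B": ["blue", "green"]})

# Row-wise pairs: only ('a', 'blue') and ('c', 'green').
paired = FirstDF.loc[list(zip(Pairs["A"], Pairs["B"]))]

# Per-level lists: every existing combination of A in {a, c} and B in {blue, green},
# which also picks up ('a', 'green').
combined = FirstDF.loc[(list(Pairs["A"]), list(Pairs["B"])), :]

print(paired)
print(combined)
```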

Praveen · Accepted Answer · 2018-10-18 06:14:33Z

You can use merge to get the result:

In [35]: df1
Out[35]:
   A       B   C
0  a    blue  43
1  a   green  59
2  b     red  56
3  c   green  80
4  c  orange  72

In [36]: df2
Out[36]:
   A      B
0  a  green
1  b    red
2  c  green

In [37]: pd.merge(df1, df2, on=['A', 'B'])['C']
Out[37]:
0    59
1    56
2    80
Name: C, dtype: int64
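The session above works on frames where A and B are ordinary columns. If A and B are index levels, as in the question, one way to reach the same shape is to call reset_index first. This is a sketch under the assumption that FirstDF and SecondDF are built as below; the names flat and result are illustrative.

```python
import pandas as pd

# Assumed reconstruction of the question's data.
FirstDF = pd.DataFrame(
    {
        "A": ["a", "a", "b", "c", "c"],
        "B": ["blue", "green", "red", "green", "orange"],
        "C": [43, 59, 56, 80, 72],
    }
).set_index(["A", "B"])

SecondDF = pd.DataFrame({"A": ["a", "b", "c"], "B": ["green", "red", "green"]})

# Turn the index levels A and B back into columns, then merge on them.
flat = FirstDF.reset_index()
result = pd.merge(flat, SecondDF, on=["A", "B"])["C"]
print(result)
```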

Andrés Marulanda · Accepted Answer · 2018-10-20 20:40:30Z

Ok I found an answer:

tuple_list = list(map(tuple,SecondDF.values))
insDF = FirstDF.loc[tuple_list].dropna()
outsDF = FirstDF.loc[~FirstDF.index.isin(tuple_list)]

This gives both the rows of FirstDF whose index is in SecondDF (insDF) and the rows whose index is not (outsDF). The dropna call is there because this lookup returns NaN rows for pairs in SecondDF that are not in FirstDF, so they need to be dropped.
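Depending on the pandas version, .loc with a list containing labels that are absent from the index may raise a KeyError instead of returning NaN rows. A variant that avoids label lookup altogether is to build both halves from a single isin mask. This is a sketch, assuming a hypothetical SecondExtra frame that includes one pair, ('z', 'pink'), not present in FirstDF.

```python
import pandas as pd

# Assumed reconstruction of the question's FirstDF.
FirstDF = pd.DataFrame(
    {
        "A": ["a", "a", "b", "c", "c"],
        "B": ["blue", "green", "red", "green", "orange"],
        "C": [43, 59, 56, 80, 72],
    }
).set_index(["A", "B"])

# Hypothetical lookup frame; ('z', 'pink') does not exist in FirstDF.
SecondExtra = pd.DataFrame(
    {"A": ["a", "b", "c", "z"], "B": ["green", "red", "green", "pink"]}
)

# Plain (A, B) tuples, one per row of the lookup frame.
tuple_list = list(SecondExtra[["A", "B"]].itertuples(index=False, name=None))

# One mask over FirstDF's index serves both selections; unknown pairs are ignored.
mask = FirstDF.index.isin(tuple_list)
insDF = FirstDF.loc[mask]
outsDF = FirstDF.loc[~mask]

print(insDF)
print(outsDF)
```

Unlike the .loc lookup, this keeps the rows in FirstDF order rather than in the order of the lookup frame.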
