0

I am using Python 3.8.6 with pandas version 1.2.4

I want to do a self join on previous rows with this dataframe:

            bar   
            one  
index                                                                                
0      0.238307 
1      0.610819 

so i prepare the dataframe before doing a pandas merge

the "left" merge data looks like this:

        bar   
        one  
0  0.238307  
1  0.610819 

the "right" merge data looks like this:

        bar   index1
        one         
0  0.238307      1
1  0.610819      2

now i try this merge:

pd.merge(left, right, left_index=True, right_on=('index1',''), suffixes=('_n','_p'))

It throws a ValueError: len(right_on) must equal the number of levels in the index of "left"

To me, this makes no sense. What counts is that the values of ('index1,'') are comparable to left.index

What am i missing?

i have also tried the following: left

  index       bar  
              one  
0     0  0.972453 
1     1  0.278209 

right

        bar  index1
        one         
0  0.972453      1
1  0.278209      2

merge expression

pd.merge(left,right,left_on=('index',''),right_on=('index1',''),suffixes=('_n','_p'))

error

 raise KeyError(key)
KeyError: ''

NB

left.loc[:,('index','')]
0    0
1    1
2    2
right.loc[:,('index1','')]
0    1
1    2
2    3

So again, some problem i don't understand

Thanks Martin

2 Answers 2

0

Is it what you expect:

>>> pd.merge(dfL, dfR, left_index=True, right_on='index1', suffixes=('_n','_p'))

      bar_n     bar_p index1
        one       one
0  0.610819  0.238307      1

From merge documentation: If it is a MultiIndex, the number of keys in the other DataFrame (either the index or a number of columns) must match the number of levels.

Sign up to request clarification or add additional context in comments.

1 Comment

Hi and thanks for your answer. Yes the result is what i expect, but the expression is not what i expected, given the columns are multi-index. Seems to reflect poorly on the documentation/design of this aspect of pandas :-(
0

The answer I discovered is:

left=left.set_index(('index',''))
right=right.set_index(('index1',''))
dfJoined=pd.merge(left,right,left_index=True,right_index=True,suffixes=('_n','_p'))

produces

      bar_n     bar_p     
        one       one
1  1.719833 -0.540152 #different random numbers from question above

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.