Ok, my real problem is bigger than this, but I have a simple working example.

>>> import pandas as pd
>>> import numpy as np
>>> a = pd.DataFrame(np.array([[2, 1990], [4,1991], [5,1992]]), \
...                  index=[1,2,3], columns=['var', 'yr'])
>>> a
   var    yr
1    2  1990
2    4  1991
3    5  1992
>>> b = pd.DataFrame(index=a.index, columns=['new_var'])
>>> b
  new_var
1     NaN
2     NaN
3     NaN
>>> b[a.yr<1992].loc[:, 'new_var'] = a[a.yr<1992].loc[:, 'var']
>>> b
  new_var
1     NaN
2     NaN
3     NaN

I desire the following output:

>>> b
  new_var
1       2
2       4
3     NaN

3 Answers

3

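b[a.yr<1992] is boolean indexing, which returns a copy of the selected rows rather than a view of b. The chained .loc assignment therefore writes to that temporary copy, and b itself is left unchanged.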

Do this instead:

b.loc[a.yr<1992, 'new_var'] = a['var']
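
To make the difference concrete, here is a minimal sketch that reuses the frames from the question. The names mask, tmp, and chained_changed_b are only illustrative, and the warning filter is there because some pandas versions warn about writes to a copy.

```python
import warnings

import numpy as np
import pandas as pd

a = pd.DataFrame(np.array([[2, 1990], [4, 1991], [5, 1992]]),
                 index=[1, 2, 3], columns=['var', 'yr'])
b = pd.DataFrame(index=a.index, columns=['new_var'])

mask = a['yr'] < 1992  # boolean Series, indexed like a and b

# Chained indexing: b[mask] builds a separate DataFrame, so the write
# lands on that temporary object (tmp) and never reaches b.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # some pandas versions warn here
    tmp = b[mask]
    tmp.loc[:, 'new_var'] = a.loc[mask, 'var']
chained_changed_b = bool(b['new_var'].notna().any())

# A single .loc call selects rows and column in one step and writes into b.
# The right-hand Series is aligned on the index, so no slice of a is needed.
b.loc[mask, 'new_var'] = a['var']

print(chained_changed_b)
print(b)
```

The first print shows whether the chained write reached b; the second shows b after the single .loc assignment.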

2 Comments

This is a good answer! However, the slice on a is unnecessary. This will suffice: b.loc[a.yr<1992, 'new_var'] = a['var']. pandas will handle the alignment for you. +1 from me.
Cool. Yeah, Pandas seems to be pretty good at being reasonably concise.
1

You can also combine assign with query, which may read more intuitively:

b.assign(new_var=a.query('yr < 1992')['var'])

   new_var
1      2.0
2      4.0
3      NaN

This returns the dataframe you'd want. You'll have to assign it back to b if you want it to persist.
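
A short sketch of that last point, again using the frames from the question. The names result and b_untouched are only illustrative.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.array([[2, 1990], [4, 1991], [5, 1992]]),
                 index=[1, 2, 3], columns=['var', 'yr'])
b = pd.DataFrame(index=a.index, columns=['new_var'])

# assign returns a new DataFrame; the filtered Series is aligned on the
# index, so the row missing from the query result (index 3) becomes NaN.
result = b.assign(new_var=a.query('yr < 1992')['var'])

# b still holds only NaN at this point.
b_untouched = bool(b['new_var'].isna().all())

# Rebind the name to keep the result.
b = result
print(b)
```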

2 Comments

This is a rather unusual use case for assign + query ;-)
@MaxU I'm always trying to push on the edges.
0

Yet another "creative" solution:

In [181]: b['new_var'] = np.where(a.yr < 1992, a['var'], b['new_var'])

In [182]: b
Out[182]:
  new_var
1       2
2       4
3     NaN
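
A related variant, offered as a sketch rather than as part of the answer above: Series.where is a different technique from np.where. It keeps values where the condition holds and fills NaN elsewhere, so it works here because the other rows of b are NaN anyway. Note that the resulting column is float rather than object.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.array([[2, 1990], [4, 1991], [5, 1992]]),
                 index=[1, 2, 3], columns=['var', 'yr'])
b = pd.DataFrame(index=a.index, columns=['new_var'])

# Series.where keeps a['var'] where the condition is True and puts NaN
# everywhere else; the assignment to the column aligns on the index.
b['new_var'] = a['var'].where(a['yr'] < 1992)
print(b)
```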
