I have a dataframe whose columns are numeric indexes, which aren't necessarily contiguous. I want to add a new column to it with a particular index, similar to:

df[4] = [1,2,3,4]

But without modifying the existing dataframe. df.assign only accepts keyword arguments (it can't be passed a dictionary directly), and the admittedly kludgy workaround of unpacking a dict with non-string keys as kwargs is explicitly guarded against:

>>> df.assign(**{4: [1,2,3,4]})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: assign() keywords must be strings

Using pd.concat works, but has a lot of line noise:

>>> a
   4  0  1  2  3
0  1  1  2  3  4
1  2  2  3  5  4
>>> pd.concat([a, pd.DataFrame({6: [1,2]})], axis=1)
   4  0  1  2  3  6
0  1  1  2  3  4  1
1  2  2  3  5  4  2

Is there a nicer way?

  • Is the line noise generated by pandas, or just a result value? IOW, can you do _ = pd.concat(...) and get rid of it? Commented Aug 1, 2015 at 14:36
  • @PatrickMaupin I mean the command itself is noisy. Something about all three types of closing bracket in a row, with the square one repeated doesn't quite strike me as nice looking code. Commented Aug 1, 2015 at 14:57
  • Right you are. Misunderstood; need coffee... khammel's answer looks a bit better, no? Commented Aug 1, 2015 at 15:02

2 Answers

join returns a new dataframe instead of modifying the existing one (it joins the two dataframes on their matching indexes):

>>> a.join(pd.DataFrame({6: [1,2]}))
   4  0  1  2  3  6
0  1  1  2  3  4  1
1  2  2  3  5  4  2

>>> a
   4  0  1  2  3
0  1  1  2  3  4
1  2  2  3  5  4
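
Because join matches rows on the index, the example above works only because both frames use the default 0, 1 index. The following is a minimal sketch of what happens otherwise; the index values 10 and 20 are made up for illustration, and passing index=a.index is one possible way to line the rows up by position:

```python
import pandas as pd

# Same data as `a` above, but with a hypothetical non-default index.
a = pd.DataFrame(
    {4: [1, 2], 0: [1, 2], 1: [2, 3], 2: [3, 5], 3: [4, 4]},
    index=[10, 20],
)

# The new frame gets the default index (0, 1), which matches nothing
# in `a`, so the left join fills column 6 with NaN.
misaligned = a.join(pd.DataFrame({6: [1, 2]}))

# Reusing a.index makes the new values line up row by row.
aligned = a.join(pd.DataFrame({6: [1, 2]}, index=a.index))

print(misaligned)
print(aligned)
print(a)  # still has only the original five columns
```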

1 Comment

And, of course, this column actually comes from another dataframe and the label isn't changing, so I can just join it directly. Probably could have done that with concat too. We should totally have a badge for x/y problem.

Alternatively, use join with a named Series:

In [870]: a.join(pd.Series([1,2], name=6))
Out[870]:
   4  0  1  2  3  6
0  1  1  2  3  4  1
1  2  2  3  5  4  2

In [871]: a
Out[871]:
   4  0  1  2  3
0  1  1  2  3  4
1  2  2  3  5  4

Or, another hacky way is to use assign with a string key and then rename the string column labels to int:

In [892]: a.assign(**{'6': [1,2]}).rename(columns=pd.to_numeric)
Out[892]:
   4  0  1  2  3  6
0  1  1  2  3  4  1
1  2  2  3  5  4  2
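
Note that rename(columns=pd.to_numeric) converts every column label, so it would also touch other string labels and should raise on a label that is not numeric. A narrower variant, sketched here under the assumption that the temporary string label "6" does not already exist in the frame, renames only the new column:

```python
import pandas as pd

a = pd.DataFrame({4: [1, 2], 0: [1, 2], 1: [2, 3], 2: [3, 5], 3: [4, 4]})

# Temporary string label (an assumption: it must not clash with an
# existing column), mapped back to the integer label afterwards.
tmp_label = "6"
result = a.assign(**{tmp_label: [1, 2]}).rename(columns={tmp_label: 6})

print(result)
print(a)  # the original frame is left untouched
```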
