I have the following data frame:

import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4],
                   'a': ['on', 'on', 'off', 'off'],
                   'b': ['on', 'off', 'on', 'off']})

How can I create a new column df['new'] whose cells can hold NumPy arrays or lists, so that I can perform assignments like:

df.loc[1, 'new'] = np.array([2, 'l'])
# or
df.loc[1, 'new'] = [2, 'l']
  • Do you really want arrays as opposed to lists? An object dtype array column can hold anything: array, list, dict, string, None. Commented Aug 18, 2021 at 19:20
  • @hpaulj A list is fine for me as well. Commented Aug 18, 2021 at 19:21
  • Why do you need to "declare" the type of object you are putting in a dataframe? Just put it there. Commented Aug 18, 2021 at 19:21
  • @Aryerez I thought so at first, but I got the following error when I did it directly: ValueError: cannot set using a multi-index selection indexer with a different length than the value. Commented Aug 18, 2021 at 19:23

1 Answer

Create the new column with dtype "object", then assign each element individually with df.at, which sets a single cell:

df["new"] = pd.Series(dtype="object")
df.at[1, 'new'] = [2, 'l']
>>> df
   id    a    b     new
0   1   on   on     NaN
1   2   on  off  [2, l]
2   3  off   on     NaN
3   4  off  off     NaN
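
For reference, here is a self-contained sketch of the same approach that also reads the stored value back. It assumes a reasonably recent pandas version; the variable name `cell` is only illustrative and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4],
                   'a': ['on', 'on', 'off', 'off'],
                   'b': ['on', 'off', 'on', 'off']})

# Empty object-dtype column: every cell starts out as NaN.
df['new'] = pd.Series(dtype='object')

# .at addresses a single cell, so the list is stored as one Python object.
df.at[1, 'new'] = [2, 'l']

# Read the cell back; it is still an ordinary list.
cell = df.at[1, 'new']
print(type(cell).__name__, cell)

# Rows that were never assigned remain missing.
print(df['new'].isna().tolist())
```

Because the column is of object dtype, each cell is just a reference to a Python object, so operations on the column are not vectorized the way they are for numeric columns.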