How do I insert a column at a specific column index in pandas?

Question

Can I insert a column at a specific column index in pandas?

import pandas as pd
df = pd.DataFrame({'l':['a','b','c','d'], 'v':[1,2,1,2]})
df['n'] = 0

This will put column n as the last column of df, but isn't there a way to tell df to put n at the beginning?

Insert a column at the beginning (leftmost end) of a DataFrame - more solutions + generalised solution for inserting any sequence (not just a constant value).
– cs95, Commented Feb 5, 2019 at 22:03

user343233 · Accepted Answer · 2024-01-07 22:36:34Z

740

See the docs: http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.insert.html

Using loc=0 inserts the new column at the beginning. The call signature is:

df.insert(loc, column, value)

df = pd.DataFrame({'B': [1, 2, 3], 'C': [4, 5, 6]})

df
Out: 
   B  C
0  1  4
1  2  5
2  3  6

idx = 0
new_col = [7, 8, 9]  # can be a list, a Series, an array or a scalar   
df.insert(loc=idx, column='A', value=new_col)

df
Out: 
   A  B  C
0  7  1  4
1  8  2  5
2  9  3  6
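
Since the question asks about an arbitrary position and the example above only shows loc=0, here is a minimal sketch of the other cases. The column names 'X' and 'Z' are made up for illustration. As far as the documentation states, loc must satisfy 0 <= loc <= len(df.columns).

```python
import pandas as pd

df = pd.DataFrame({'B': [1, 2, 3], 'C': [4, 5, 6]})

# loc=1 places the new column between 'B' and 'C'
df.insert(loc=1, column='X', value=[7, 8, 9])

# loc=len(df.columns) places the new column at the far right
df.insert(loc=len(df.columns), column='Z', value=0)

print(df.columns.tolist())
```

Note that insert modifies df in place and returns None, so there is nothing to assign back.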

edited Jan 7, 2024 at 22:36

user343233


answered Sep 7, 2013 at 15:32

Jeff


4 Comments

Peter Maguire Over a year ago

For future users, the new parameters are "loc", "column", and "value".

mLstudent33 Over a year ago

I counted and recounted the length of values and length of index after printing but keep getting ValueError: Length of values does not match length of index

Sulphur Over a year ago

For future users, if you want to insert with the help of specific column name instead of index, use: df.insert(df.columns.get_loc('col_name'), 'new_col_name', ser_to_insert). insert doesn't directly support column name use case but you can get the column index from column name and pass that.
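
A small sketch of the get_loc approach from the comment above, using the DataFrame from the question. The column names 'n' and 'm' are only examples. One assumption to keep in mind: get_loc returns a plain integer only when the column name is unique; with duplicate names it can return a slice or a boolean mask, which insert will not accept.

```python
import pandas as pd

df = pd.DataFrame({'l': ['a', 'b', 'c', 'd'], 'v': [1, 2, 1, 2]})

# Integer position of the existing column 'v'
pos = df.columns.get_loc('v')

# Insert 'n' immediately before 'v'
df.insert(pos, 'n', 0)

# Insert 'm' immediately after 'v'; look the position up again
# because the previous insert shifted 'v' one place to the right
df.insert(df.columns.get_loc('v') + 1, 'm', 1)

print(df.columns.tolist())
```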

DanielBell99 Over a year ago

Replace value with pd.Series(value_list)
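
To illustrate the difference between the two comments above, here is a hedged sketch; the index labels 10, 20, 30 are made up. A plain list has to contain exactly len(df) elements, otherwise insert raises the ValueError quoted above. A Series is aligned on index labels instead, so a shorter Series is accepted and the unmatched rows become NaN. Be aware that pd.Series(value_list) gets a default 0..n-1 index, so it only lines up with a DataFrame that uses the same labels.

```python
import pandas as pd

df = pd.DataFrame({'B': [1, 2, 3]}, index=[10, 20, 30])

# A list of the wrong length is rejected (tried on a copy so df stays untouched)
try:
    df.copy().insert(0, 'A', [7, 8])
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# A Series is matched to the rows by index label; the row labelled 30
# has no counterpart in the Series, so it ends up as NaN
s = pd.Series([7, 8], index=[10, 20])
df.insert(0, 'A', s)
print(df)
```

So wrapping the values in a Series makes the error go away, but it can hide a real length or index mismatch behind NaN values.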

Hugo Vares · Accepted Answer · 2020-01-12 14:45:55Z

88

If you want a single value for all rows:

df.insert(0,'name_of_column','')
df['name_of_column'] = value

Edit:

You can also:

df.insert(0,'name_of_column',value)

edited Jan 12, 2020 at 14:45

answered Dec 22, 2019 at 20:22

Hugo Vares


3 Comments

Brian Wylie Over a year ago

This df.insert(0,'name_of_column',value) was exactly what I needed.. thanks :)

arman_aegit Over a year ago

Is there anyway to get a copy of the dataframe with the inserted column and keep the original intact?

Amin.A Over a year ago

@arman_aegit: While chaining, you can try data.copy().pipe(lambda df: (df.insert(0, 'test', 100), df)[1]). This preserves the data DataFrame while adding a test column at the beginning of the copy in the chain. Note that .insert() operates in place, so you have to make a copy of your data if you want to preserve it.
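
If the lambda-with-tuple trick is hard to read, the same idea can be written as a small named function. This is only a sketch; with_inserted is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

def with_inserted(frame, loc, column, value):
    """Return a copy of `frame` with `column` inserted at position `loc`.

    Hypothetical helper for illustration; the original frame is not modified.
    """
    out = frame.copy()
    out.insert(loc, column, value)  # insert mutates `out` and returns None
    return out

data = pd.DataFrame({'l': ['a', 'b'], 'v': [1, 2]})

# pipe passes `data` as the first argument, followed by the extra arguments
result = data.pipe(with_inserted, 0, 'test', 100)

print(data.columns.tolist())
print(result.columns.tolist())
```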

mhc · Accepted Answer · 2021-02-19 13:28:48Z

df.insert(loc, column_name, value)

This works as long as no other column has the same name. If a column with the given name already exists in the DataFrame, insert raises a ValueError.

You can pass the optional parameter allow_duplicates=True to create a new column with an already existing column name.

Here is an example:



    >>> df = pd.DataFrame({'b': [1, 2], 'c': [3,4]})
    >>> df
       b  c
    0  1  3
    1  2  4
    >>> df.insert(0, 'a', -1)
    >>> df
       a  b  c
    0 -1  1  3
    1 -1  2  4
    >>> df.insert(0, 'a', -2)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Python39\lib\site-packages\pandas\core\frame.py", line 3760, in insert
        self._mgr.insert(loc, column, value, allow_duplicates=allow_duplicates)
      File "C:\Python39\lib\site-packages\pandas\core\internals\managers.py", line 1191, in insert
        raise ValueError(f"cannot insert {item}, already exists")
    ValueError: cannot insert a, already exists
    >>> df.insert(0, 'a', -2,  allow_duplicates = True)
    >>> df
       a  a  b  c
    0 -2 -1  1  3
    1 -2 -1  2  4

This is brilliant, actually also suggested in Pandas official documentation. Thanks for bringing this up @mhc

cs95 · Accepted Answer · 2017-09-19 18:56:00Z

17

You could try to extract columns as list, massage this as you want, and reindex your dataframe:

>>> cols = df.columns.tolist()
>>> cols = [cols[-1]]+cols[:-1] # or whatever change you need
>>> df.reindex(columns=cols)

   n  l  v
0  0  a  1
1  0  b  2
2  0  c  1
3  0  d  2

EDIT: this can be done in one line; however, it looks a bit ugly. Maybe a cleaner proposal will come along...

>>> df.reindex(columns=['n']+df.columns[:-1].tolist())

   n  l  v
0  0  a  1
1  0  b  2
2  0  c  1
3  0  d  2
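
Two things about reindex may be worth spelling out; the sketch below uses the DataFrame from the question, and the misspelled label 'V' is deliberate. First, reindex returns a new DataFrame, so the result has to be assigned if you want to keep it. Second, reindex does not complain about a column label that does not exist; it creates that column filled with NaN, whereas plain selection with a list of labels raises a KeyError.

```python
import pandas as pd

df = pd.DataFrame({'l': ['a', 'b', 'c', 'd'], 'v': [1, 2, 1, 2]})
df['n'] = 0

# reindex returns a new frame; df itself keeps its original column order
reordered = df.reindex(columns=['n', 'l', 'v'])

# A misspelled label ('V' instead of 'v') is not an error for reindex:
# it produces a column filled with NaN
typo = df.reindex(columns=['n', 'l', 'V'])
print(typo['V'].isna().all())

# Plain selection with the same misspelled label raises KeyError instead
try:
    df[['n', 'l', 'V']]
    selection_failed = False
except KeyError:
    selection_failed = True
print(selection_failed)
```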

edited Sep 19, 2017 at 18:56

cs95


answered Sep 7, 2013 at 14:24

Nic


Ka Wa Yip · Accepted Answer · 2022-03-14 17:31:32Z

4

A general 4-line routine

You can have the following 4-line routine whenever you want to create a new column and insert into a specific location loc.

df['new_column'] = ... #new column's definition
col = df.columns.tolist()
col.insert(loc, col.pop()) #loc is the column's index you want to insert into
df = df[col]

In your example, it is simple:

df['n'] = 0
col = df.columns.tolist()
col.insert(0, col.pop()) 
df = df[col]
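
As a quick check, here is the routine applied with a computed column and loc = 1, compared against DataFrame.insert. The column name 'double_v' is invented for the example. One caveat, stated as an assumption about how you use it: the routine relies on the new column landing in the last position, so if a column with that name already exists, the assignment overwrites it in place and col.pop() moves an unrelated column instead.

```python
import pandas as pd

df = pd.DataFrame({'l': ['a', 'b', 'c', 'd'], 'v': [1, 2, 1, 2]})
loc = 1

# The routine from the answer, run on a copy
routine = df.copy()
routine['double_v'] = routine['v'] * 2   # new column is appended last
col = routine.columns.tolist()
col.insert(loc, col.pop())               # move the last name to position loc
routine = routine[col]

# The same insertion done directly with DataFrame.insert
direct = df.copy()
direct.insert(loc, 'double_v', direct['v'] * 2)

print(routine.columns.tolist())
print(direct.columns.tolist())
```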

edited Mar 14, 2022 at 17:31

answered Mar 14, 2022 at 17:25

Ka Wa Yip


rra · Accepted Answer · 2020-06-18 19:17:41Z

Here is a very simple answer to this (only one line).

You can do this after adding the 'n' column to your df, as follows.

import pandas as pd
df = pd.DataFrame({'l':['a','b','c','d'], 'v':[1,2,1,2]})
df['n'] = 0

df
    l   v   n
0   a   1   0
1   b   2   0
2   c   1   0
3   d   2   0

# here you can add the below code and it should work.
df = df[list('nlv')]
df

    n   l   v
0   0   a   1
1   0   b   2
2   0   c   1
3   0   d   2



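Note that list('nlv') works only because every column name is a single character: it splits the string into ['n', 'l', 'v']. If your column names are longer than one character, pass the names as a list inside the indexing brackets (two pairs of brackets) instead.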

import pandas as pd
df = pd.DataFrame({'Upper':['a','b','c','d'], 'Lower':[1,2,1,2]})
df['Net'] = 0
df['Mid'] = 2
df['Zsore'] = 2

df

    Upper   Lower   Net Mid Zsore
0   a       1       0   2   2
1   b       2       0   2   2
2   c       1       0   2   2
3   d       2       0   2   2

# select the columns in the desired order (note the double brackets)
df = df[['Mid', 'Upper', 'Lower', 'Net', 'Zsore']]
df

   Mid  Upper   Lower   Net Zsore
0   2   a       1       0   2
1   2   b       2       0   2
2   2   c       1       0   2
3   2   d       2       0   2
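
With many columns, typing out every name gets tedious. A possible variation, sketched here under the assumption that you only want to move a few named columns to the front and keep the rest in their current order (the variable name front is made up):

```python
import pandas as pd

df = pd.DataFrame({'Upper': ['a', 'b', 'c', 'd'], 'Lower': [1, 2, 1, 2]})
df['Net'] = 0
df['Mid'] = 2

# list() on a multi-character name splits it into letters,
# which is why list('nlv') only suits one-letter column names
print(list('Mid'))

# Columns to move to the front, followed by all remaining columns
front = ['Mid']
df = df[front + [c for c in df.columns if c not in front]]

print(df.columns.tolist())
```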

What if we wanted to add a few columns from another df_other to the loc 0 and a few columns from df_other to the end of our df?
