33

When creating a DataFrame from a list, is it possible to set one of the values as the index?

import pandas as pd

tmp = [['a', 'a1'], ['b', 'b1']]

df = pd.DataFrame(tmp, columns=["First", "Second"])

  First Second
0     a     a1
1     b     b1

And how I'd like it to look:

  First Second
a     a     a1
b     b     b1
2
  • 8
    df.index = df.First Commented Aug 24, 2016 at 20:07
  • 2
    Note: as others have mentioned, to make an existing column the index, use df.set_index('col_name', inplace=True); to use an external object such as a list or pd.Series as the index instead, use df.index = list_1. Commented Aug 19, 2019 at 18:05

5 Answers

25

Convert the column to a list before assigning it to the index:

df.index = list(df["First"])
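
As a rough sketch of why the list conversion matters (assuming the tmp data from the question; the names named and unnamed are only illustrative): assigning the column itself also works, but the index then carries the column's name, whereas a plain list leaves the index unnamed.

```python
import pandas as pd

tmp = [['a', 'a1'], ['b', 'b1']]
df = pd.DataFrame(tmp, columns=["First", "Second"])

# Assigning the column (a Series) keeps its name on the index
named = df.copy()
named.index = named["First"]
print(named.index.name)    # First

# Converting to a list first yields an unnamed index
unnamed = df.copy()
unnamed.index = list(unnamed["First"])
print(unnamed.index.name)  # None
```

The unnamed variant matches the layout asked for in the question, where no header line appears above the index.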

1 Comment

This option is versatile: it can create an index from external objects (lists, pd.Series, etc.) and also answers the OP's question.
16
>>> pd.DataFrame(tmp, columns=["First", "Second"]).set_index('First', drop=False)
      First Second
First             
a         a     a1
b         b     b1
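
A minimal sketch of what drop=False changes, assuming the same tmp data as in the question (the variable names moved, kept and clean are hypothetical). By default set_index moves the column into the index; drop=False keeps a copy among the columns, and rename_axis(None) can clear the index name if the extra header line is unwanted.

```python
import pandas as pd

tmp = [['a', 'a1'], ['b', 'b1']]
df = pd.DataFrame(tmp, columns=["First", "Second"])

# Default (drop=True): "First" is moved out of the columns into the index
moved = df.set_index('First')
print(list(moved.columns))  # ['Second']

# drop=False: "First" stays as a column and is also used as the index
kept = df.set_index('First', drop=False)
print(list(kept.columns))   # ['First', 'Second']

# Clear the index name so no "First" header line is printed above the index
clean = kept.rename_axis(None)
print(clean.index.name)     # None
```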

1 Comment

If I want to set another list as the index (say ['x', 'y']) that is not an existing column of the DataFrame, how can I do it?
14

set_axis

To set arbitrary values as the index, best practice is to use set_axis:

df = df.set_axis(['idx1', 'idx2'])

#       First  Second
# idx1      a      a1
# idx2      b      b1
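
For illustration, a short sketch of set_axis on both axes, assuming the question's tmp data (rows and cols are hypothetical names). It returns a new object, so the original frame keeps its default index.

```python
import pandas as pd

tmp = [['a', 'a1'], ['b', 'b1']]
df = pd.DataFrame(tmp, columns=["First", "Second"])

# Relabel the rows (axis=0 is the default)
rows = df.set_axis(['idx1', 'idx2'])
print(list(rows.index))    # ['idx1', 'idx2']

# Relabel the columns instead
cols = df.set_axis(['x', 'y'], axis='columns')
print(list(cols.columns))  # ['x', 'y']

# The original DataFrame is left untouched
print(list(df.index))      # [0, 1]
```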

set_index (list vs array)

It's also possible to pass arbitrary values to set_index, but note the difference between passing a list vs array:

  • list — set_index assigns these columns as the index:

    df.set_index(['First', 'First'])
    
    #              Second
    # First First        
    # a     a          a1
    # b     b          b1
    
  • array (Series/Index/ndarray) — set_index assigns these values as the index:

    df = df.set_index(pd.Series(['First', 'First']))
    
    #        First  Second
    # First      a      a1
    # First      b      b1
    

    Note that passing arrays to set_index is very contentious among the devs and may even get deprecated.
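
The list vs array distinction can be sketched as follows, assuming the question's tmp data (by_label, by_values and error_raised are hypothetical names). A list is read as column labels, so labels that are not columns raise a KeyError, while an array is used as the index values directly.

```python
import numpy as np
import pandas as pd

tmp = [['a', 'a1'], ['b', 'b1']]
df = pd.DataFrame(tmp, columns=["First", "Second"])

# A list is interpreted as column labels: "First" becomes the index
by_label = df.set_index(['First'])
print(list(by_label.index))   # ['a', 'b']

# An array is interpreted as the new index values; no column is consumed
by_values = df.set_index(np.array(['x', 'y']))
print(list(by_values.index))  # ['x', 'y']

# A list of values that are not column labels fails
error_raised = False
try:
    df.set_index(['x', 'y'])
except KeyError:
    error_raised = True
print(error_raised)           # True
```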


Why not just modify df.index directly?

Directly modifying attributes is fine and is used often, but using methods has its advantages:

  • Methods raise clear errors on bad input (note that direct assignment is checked too), e.g.:

    df = df.set_axis(['idx1', 'idx2', 'idx3'])
    
    # ValueError: Length mismatch: Expected axis has 2 elements, new values have 3 elements
    
    df.index = ['idx1', 'idx2', 'idx3']
    
    # Direct assignment is validated as well: in recent pandas versions this raises the same ValueError
    
  • Methods can be chained, e.g.:

    df.some_method().set_axis(['idx1', 'idx2']).another_method()
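
A concrete version of such a chain might look like the sketch below. The surrounding methods (rename and sort_index) are arbitrary stand-ins for some_method and another_method, chosen only for illustration, and the data is the question's tmp list.

```python
import pandas as pd

tmp = [['a', 'a1'], ['b', 'b1']]

result = (
    pd.DataFrame(tmp, columns=["First", "Second"])
    .set_axis(['idx1', 'idx2'])     # relabel the rows mid-chain
    .rename(columns=str.lower)      # lower-case the column names
    .sort_index(ascending=False)    # sort by the new index labels
)
print(list(result.index))    # ['idx2', 'idx1']
print(list(result.columns))  # ['first', 'second']

# set_axis rejects a label list of the wrong length
length_error = False
try:
    result.set_axis(['idx1', 'idx2', 'idx3'])
except ValueError:
    length_error = True
print(length_error)          # True
```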
    

1 Comment

set_axis() is also the way to go to make mypy happy! df.index = [...] makes mypy unhappy because df.index is not of type list.
7

If you don't want an index name:

df = pd.DataFrame(tmp, columns=["First", "Second"], index=[i[0] for i in tmp])

Result:

  First Second
a     a     a1
b     b     b1

1
import pandas as pd

tmp = [['a', 'a1'], ['b', 'b1']]
# Use the first element of each row as that row's index label
df = pd.DataFrame(tmp, columns=["First", "Second"]).set_axis([row[0] for row in tmp])
df
