How to select column and rows in pandas without column or row names?

Question

I have a pandas DataFrame (df) like this:

                         Close      Close     Close Close       Close
Date                                                                 
2000-01-03 00:00:00        NaN        NaN       NaN   NaN   -0.033944
2000-01-04 00:00:00        NaN        NaN       NaN   NaN   0.0351366
2000-01-05 00:00:00  -0.033944        NaN       NaN   NaN  -0.0172414
2000-01-06 00:00:00  0.0351366  -0.033944       NaN   NaN -0.00438596
2000-01-07 00:00:00 -0.0172414  0.0351366 -0.033944   NaN   0.0396476

In R, if I want to select the fifth column:

five=df[,5]

and everything except the fifth column:

rest=df[,-5]

How can I do similar operations with a pandas DataFrame?

I tried this in pandas:

five=df.ix[,5]

but it gives this error:

 File "", line 1
    df.ix[,5]
           ^
SyntaxError: invalid syntax

piRSquared · Accepted Answer · 2016-08-26 07:26:05Z

18

Use iloc. It is explicitly a position-based indexer. ix can be either label-based or position-based, so it becomes ambiguous when the index is integer-based.

df.iloc[:, [4]]

For all but the fifth column

slc = list(range(df.shape[1]))
slc.remove(4)

df.iloc[:, slc]

or equivalently

df.iloc[:, [i for i in range(df.shape[1]) if i != 4]]
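Another way to express "all but the fifth column" is a boolean mask over column positions, which iloc also accepts. This is only a sketch: the small frame below is made-up data with repeated column names like the one in the question, and the names keep, rest and five are just for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with repeated column names, similar to the question's data.
df = pd.DataFrame(np.arange(15.0).reshape(3, 5), columns=["Close"] * 5)

# Boolean mask over column positions: True everywhere except position 4.
keep = np.arange(df.shape[1]) != 4

rest = df.iloc[:, keep]   # every column except the fifth
five = df.iloc[:, [4]]    # only the fifth column, kept as a DataFrame
```

Because the mask works on positions rather than labels, it does not matter that every column is called Close.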


theshubhagrwl · Accepted Answer · 2020-05-13 05:34:23Z

4

If your DataFrame does not have column/row labels and you want to select specific columns, use the iloc indexer.

For example, to select the first column and all rows:

df = dataset.iloc[:,0]

Here the df variable will hold the values from the first column of your DataFrame.

Remember that the result is a Series, not a DataFrame:

type(df) -> pandas.core.series.Series

Hope it helps
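To make the Series vs. DataFrame point concrete, here is a minimal sketch. The dataset frame is hypothetical data; the only difference between the first two selections is whether a scalar position or a list of positions is passed to iloc.

```python
import pandas as pd

# Hypothetical data with default integer labels.
dataset = pd.DataFrame([[1, 2], [3, 4]])

as_series = dataset.iloc[:, 0]    # scalar position -> Series
as_frame = dataset.iloc[:, [0]]   # list of positions -> one-column DataFrame
first_row = dataset.iloc[0]       # rows work the same way: scalar -> Series
```

If you need to keep the two-dimensional shape (for example to pass the result on to something that expects a DataFrame), use the list form.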


Hanshan · Accepted Answer · 2016-08-26 05:59:14Z

2

If you want the fifth column:

df.ix[:,4]

Stick the colon in there to take all the rows for that column.

To exclude the fifth column you could try:

df.ix[:, (x for x in range(0, len(df.columns)) if x != 4)]
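Note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so the two lines above only run on old versions. A rough equivalent with iloc might look like the sketch below. The frame is made-up data, and I pass a list rather than a generator to be on the safe side.

```python
import pandas as pd

# Hypothetical two-row frame with five columns.
df = pd.DataFrame([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

# Fifth column: positions are zero-based, so R's 5 becomes 4 here.
fifth = df.iloc[:, 4]

# Everything except the fifth column, selected by position.
others = df.iloc[:, [x for x in range(len(df.columns)) if x != 4]]
```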

edited Aug 26, 2016 at 5:59

answered Aug 26, 2016 at 5:16


5 Comments

Eka Over a year ago

It's showing this error: IndexError: index 5 is out of bounds for axis 0 with size 5, but df.ix[:,4] works.

Hanshan Over a year ago

Sorry, it is zero-based, so you want 4.

Eka Over a year ago

OK, got it. In Python indexing starts from 0, and in R it starts from 1.

Hanshan Over a year ago

As for exclusion you can use standard Python language features to create ranges, etc, in your slicing/indexing - I've added an example.

9769953 Over a year ago

Usage of .ix has been abandoned: use .iloc instead.

Nehal J Wani · Accepted Answer · 2016-08-26 05:24:48Z

To select a column by index:

In [19]: df
Out[19]: 
                 Date     Close   Close.1   Close.2  Close.3   Close.4
0  2000-01-0300:00:00       NaN       NaN       NaN      NaN -0.033944
1  2000-01-0400:00:00       NaN       NaN       NaN      NaN  0.035137
2  2000-01-0500:00:00 -0.033944       NaN       NaN      NaN -0.017241
3  2000-01-0600:00:00  0.035137 -0.033944       NaN      NaN -0.004386
4  2000-01-0700:00:00 -0.017241  0.035137 -0.033944      NaN  0.039648

In [20]: df.ix[:, 5]
Out[20]: 
0   -0.033944
1    0.035137
2   -0.017241
3   -0.004386
4    0.039648
Name: Close.4, dtype: float64

In [21]: df.icol(5)
/usr/bin/ipython:1: FutureWarning: icol(i) is deprecated. Please use .iloc[:,i]
  #!/usr/bin/python2
Out[21]: 
0   -0.033944
1    0.035137
2   -0.017241
3   -0.004386
4    0.039648
Name: Close.4, dtype: float64

In [22]: df.iloc[:, 5]
Out[22]: 
0   -0.033944
1    0.035137
2   -0.017241
3   -0.004386
4    0.039648
Name: Close.4, dtype: float64

To select all columns except the one at a given index:

In [29]: df[[df.columns[i] for i in range(len(df.columns)) if i != 5]]
Out[29]: 
                 Date     Close   Close.1   Close.2  Close.3
0  2000-01-0300:00:00       NaN       NaN       NaN      NaN
1  2000-01-0400:00:00       NaN       NaN       NaN      NaN
2  2000-01-0500:00:00 -0.033944       NaN       NaN      NaN
3  2000-01-0600:00:00  0.035137 -0.033944       NaN      NaN
4  2000-01-0700:00:00 -0.017241  0.035137 -0.033944      NaN
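One caveat with going through df.columns: that selection is by label, so it relies on the column names being unique (here they are Close, Close.1, and so on). With repeated names like in the question, each label matches every column of that name. A small sketch with made-up data to compare the two approaches:

```python
import pandas as pd

# Hypothetical frame where every column has the same name.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["Close", "Close", "Close"])

# Keep every position except the last one.
positions = [i for i in range(len(df.columns)) if i != 2]

# Label-based: each "Close" label matches all columns named "Close".
by_label = df[[df.columns[i] for i in positions]]

# Position-based: exactly the two requested columns.
by_position = df.iloc[:, positions]

print(by_label.shape, by_position.shape)
```

So if the names are not unique, the iloc form is the safer choice.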
