Pandas remove column by index

Question

Suppose I have a DataFrame like this:

>>> df = pd.DataFrame([[1,2,3], [4,5,6], [7,8,9]], columns=['a','b','b'])
>>> df
   a  b  b
0  1  2  3
1  4  5  6
2  7  8  9

I want to remove the second 'b' column. If I just use the del statement, it deletes both 'b' columns:

>>> del df['b']
>>> df
   a
0  1
1  4
2  7

I can select columns by position with .iloc[] and reassign the DataFrame, but how can I delete only the second 'b' column, for example by index?

That's interesting. Reassigning sounds like the appropriate move. On second thought, you want to delete the second b based not on the column names, since you have duplicates, but on an index, so your algorithm already uses that index somehow. Why not just change the columns to index-based labels in that case?
– Zeugma, Commented Nov 14, 2013 at 9:51
@Boud good suggestion. I could rename all the columns I want to delete and then delete them by name. I'll try it when I get home.
– roman, Commented Nov 14, 2013 at 10:02
afaik, del df['b'] translates to a block manager command that removes the relevant items from all blocks, i.e. it is roughly equivalent to the reassignment df = df.iloc[:,:2]
– alko, Commented Nov 14, 2013 at 10:23

Puffin GDI · Accepted Answer · 2013-11-14 09:57:54Z

6

# Drop both 'b' columns, then join back only the first one (selected by position)
df = df.drop(['b'], axis=1).join(df['b'].iloc[:, 0:1])

>>> df
   a  b
0  1  2
1  4  5
2  7  8

Or just for this case

df = df.iloc[:, 0:2]

But I think there are other, better ways.
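
For the general case, where the column to drop is not the last one, one possible approach is to build a boolean mask over column positions and select with .iloc. This is only a sketch, not part of the original answer; the helper name drop_column_at is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=['a', 'b', 'b'])

def drop_column_at(frame, position):
    """Return a new DataFrame without the column at integer `position`.

    Hypothetical helper for illustration; it works on positions, so
    duplicate column names do not matter.
    """
    keep = np.ones(frame.shape[1], dtype=bool)  # start by keeping every column
    keep[position] = False                      # mark the single position to drop
    return frame.iloc[:, keep]                  # positional selection with a boolean mask

# Drop the column at position 2, i.e. the second 'b'
result = drop_column_at(df, 2)
```

Because the mask is positional, the same call drops a middle column just as well, e.g. drop_column_at(df, 1) would keep 'a' and the second 'b'.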


1 Comment

This is the best way of retaining the first instance of a duplicate column that I have yet found!
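
If the goal is specifically to keep the first instance of every duplicated column name, as this comment describes, a mask built from Index.duplicated() may also work. A minimal sketch, assuming the same example frame as in the question:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=['a', 'b', 'b'])

# duplicated() flags every repeat of a column name after its first occurrence
is_repeat = df.columns.duplicated()

# Keep only the columns that are not repeats
deduped = df.loc[:, ~is_repeat]
```

Unlike the positional approaches, this drops every later repeat of every name at once, so it only fits when the first occurrence is always the one to keep.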