15

According to this thread: SO: Column names to list

It should be straightforward to convert the column names to a list. But if I do:

df.columns.tolist()

I do get:

[u'q_igg', u'q_hcp', u'c_igg', u'c_hcp']

I know I could strip the u and the ' characters myself, but I would like to get the clean names as a list without any workaround. Is that possible?

1
  • 3
    This is correct, it just indicates that the strings are Unicode strings. Commented Nov 25, 2014 at 14:23

6 Answers

23

Or, you could try:

df2 = df.columns.get_values()

which will give you:

array(['q_igg', 'q_hcp', 'c_igg', 'c_hcp'], dtype=object)

then:

df2.tolist()

which gives you:

['q_igg', 'q_hcp', 'c_igg', 'c_hcp']

5 Comments

pretty verbose .. but maybe that's the only way ..?
Slightly less verbose: df.columns.values.tolist()
The get_values() method is deprecated: "FutureWarning: The 'get_values' method is deprecated and will be removed in a future version. Use '.to_numpy()' or '.array' instead."
Please update your answer, since it is still the accepted answer.
Try this: list(df2)
4

A simple and easy way, where df is the name of the DataFrame variable:

df.columns.to_list()

This returns a list of all the column names.
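
For comparison, here is a minimal sketch of the spellings mentioned in this thread, assuming Python 3 and a reasonably recent pandas. The empty frame below is hypothetical and only reuses the column names from the question. On Python 3 none of these print a u prefix, because every str is already Unicode.

```python
import pandas as pd

# Hypothetical empty frame that reuses the column names from the question.
df = pd.DataFrame(columns=["q_igg", "q_hcp", "c_igg", "c_hcp"])

names_a = df.columns.tolist()             # Index -> plain Python list
names_b = df.columns.to_list()            # alias of tolist() on the Index
names_c = list(df)                        # iterating a DataFrame yields its column labels
names_d = df.columns.to_numpy().tolist()  # the replacement suggested by the FutureWarning

print(names_a)
```

All four should produce the same list of plain str objects, so the choice is mostly a matter of taste.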


3

The list [u'q_igg', u'q_hcp', u'c_igg', u'c_hcp'] contains Unicode strings: the u prefix marks each one as a Unicode string, and the ' characters are just the quotes that delimit each string when the list is displayed. Neither is part of the names themselves, so you can use the names as they are in your code. See the Unicode HOWTO for more details on Unicode strings in Python 2.x.
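
To illustrate that the quotes belong to the displayed representation and not to the data, here is a small sketch, assuming Python 3 (where the u prefix is still accepted in source code but no longer shown in the output):

```python
name = u"q_igg"      # the u prefix is legal in Python 3 but changes nothing

shown = repr(name)   # what a list display uses for each element
print(shown)         # 'q_igg'  (quotes come from repr)
print(name)          # q_igg    (the string itself has no quotes)

length = len(name)   # 5 characters: neither the prefix nor the quotes are counted
```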


1

If you're just interested in printing the names without any quotes or Unicode indicators, you could do something like this:

In [19]: print "[" + ", ".join(df) + "]"
[q_igg, q_hcp, c_igg, c_hcp]
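
The session above uses the Python 2 print statement. A rough Python 3 equivalent might look like the sketch below. It uses a plain list standing in for the column names so that it runs without pandas; passing the DataFrame itself to join should behave the same way, since iterating a DataFrame yields its column labels.

```python
# Plain list standing in for the column names from the question.
names = ["q_igg", "q_hcp", "c_igg", "c_hcp"]

# map(str, ...) is an added precaution in case some labels are not strings.
line = "[" + ", ".join(map(str, names)) + "]"
print(line)  # [q_igg, q_hcp, c_igg, c_hcp]
```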


1

As already mentioned, the u means that the strings are Unicode. Anyway, the cleanest way would be to convert the column names to ASCII or something like that.

In [4]: cols
Out[4]: [u'q_igg', u'q_hcp', u'c_igg', u'c_hcp']

In [5]: [i.encode('ascii', 'ignore') for i in cols]
Out[5]: ['q_igg', 'q_hcp', 'c_igg', 'c_hcp']

The problem here is that you would lose any special characters that cannot be encoded in ASCII.
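
A small sketch of that loss, assuming Python 3 and a made-up column name ("größe") that is not in the original data. Note that on Python 3 encode returns bytes objects, so the result is displayed with a b prefix instead of a u prefix:

```python
# "größe" is a hypothetical column name used only to show the data loss.
cols = ["q_igg", "größe"]

# 'ignore' silently drops every character that ASCII cannot represent.
encoded = [c.encode("ascii", "ignore") for c in cols]
print(encoded)  # [b'q_igg', b'gre']
```

The ö and ß are dropped without any warning, so two different column names could even collapse into the same value.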

A much dirtier solution would be to take the string representation of the list object and just strip the u. I would not use that, but it might fit your needs in this special case ;-)

In [7]: x = repr(cols)

In [8]: x
Out[8]: "[u'q_igg', u'q_hcp', u'c_igg', u'c_hcp']"

In [11]: x.replace("u", "")
Out[11]: "['q_igg', 'q_hcp', 'c_igg', 'c_hcp']"

see: https://docs.python.org/2/library/repr.html

1 Comment

Commenting on behalf of @AsheKetchum who doesn't have enough rep: The downside of .replace is that it might replace 'u' if your original variables have u in their names. e.g. "u'q_ugg'" would become "'q_gg'"
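
The pitfall from this comment can be reproduced with the sketch below. The regular expression is only one possible way to narrow the replacement and is not something proposed in the thread: it removes a u only when it starts a token and is immediately followed by a quote. It is still a string hack and can misfire on unusual names (for example a column literally named u).

```python
import re

# String form of a Python 2 style list repr, using the name from the comment.
text = "[u'q_ugg', u'q_hcp']"

# Naive approach: removes every 'u', including the one inside the name.
naive = text.replace("u", "")
print(naive)   # ['q_gg', 'q_hcp']

# Narrower approach (an assumption, not from the thread): only drop a 'u'
# that sits at a word boundary directly before an opening quote.
narrow = re.sub(r"\bu'", "'", text)
print(narrow)  # ['q_ugg', 'q_hcp']
```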
0

This will do the job:

list(df2)

