In my dataset, I have a feature (called Size) like this one:

import pandas as pd


dit={"Size" : ["0","0","5","15","10"] }
dt = pd.DataFrame(data=dit)

When I run dt.info(), it gives me the result below:

Size                                     140 non-null object

However, I expect it to be int. When I try the code below:

dt.loc[:,"Size"] = dt.loc[:,"Size"].astype(int)

it complains with:

ValueError: invalid literal for int() with base 10: ' '

How can I convert Size to int?

  • Does .to_numeric() work for you? Commented Jul 21, 2019 at 8:16
  • Which pandas version do you use? I used your code exactly and got Size 5 non-null int64. Commented Jul 21, 2019 at 8:18
  • dt['Size'] = dt['Size'].astype(int) - can you try this? Commented Jul 21, 2019 at 8:18
  • If I run the code, I get int like Tom too. Strange! Commented Jul 21, 2019 at 8:21
  • I perhaps got this when I ran dt.info(): <class 'pandas.core.frame.DataFrame'> RangeIndex: 5 entries, 0 to 4 Data columns (total 1 columns): Size 5 non-null int64 dtypes: int64(1) memory usage: 120.0 bytes Commented Jul 21, 2019 at 8:29

2 Answers

Use pd.to_numeric():

import pandas as pd

dit = {"Size": ['0', '0', '5', '15', '10']}
dt = pd.DataFrame(data=dit)
# Parse the string column as numbers and assign it back to the DataFrame
dt['Size'] = pd.to_numeric(dt['Size'])
dt.info()

Output

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 1 columns):
Size    5 non-null int64
dtypes: int64(1)
memory usage: 120.0 bytes
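
The error in the question mentions ' ', which suggests the real 140-row column contains at least one blank string that the five-value sample does not show. Here is a minimal sketch of one way to handle that, assuming blanks should be treated as missing values rather than as 0. The blank entry in the sample data below is hypothetical, added only to reproduce the situation.

```python
import pandas as pd

# Hypothetical data: the question's sample with one blank entry added
dt = pd.DataFrame({"Size": ["0", "0", " ", "15", "10"]})

# errors="coerce" turns values that cannot be parsed into NaN instead of raising
sizes = pd.to_numeric(dt["Size"], errors="coerce")

# The nullable "Int64" dtype stores integers while allowing missing values
dt["Size"] = sizes.astype("Int64")

print(dt["Size"])
```

If a plain int64 column is needed instead, the missing values would have to be dropped or filled first (for example with fillna), which is a choice that depends on what a blank Size means in the dataset.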

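Select the column to convert, use .values to get the array containing all of its values, and then call astype(int) to convert that array to integers. Note that this returns a new array and leaves dt unchanged, so the result must be assigned back to the column to keep it. Like the astype call in the question, it also raises ValueError if any value is a blank string.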

dt['Size'].values.astype(int)
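
As a rough sketch of how the expression above could be used on the question's sample data: the name `converted` is only illustrative, and the data is assumed to contain nothing but digit strings.

```python
import pandas as pd

dt = pd.DataFrame({"Size": ["0", "0", "5", "15", "10"]})

# astype(int) returns a new integer array; dt itself is not modified yet
converted = dt["Size"].values.astype(int)

# Assign the array back so the column holds integers
dt["Size"] = converted

print(dt.dtypes)
```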
