I want to replace "?" with NaN in a pandas DataFrame in Python. The following code does not work, and I am not sure why.

import pandas as pd
import numpy as np
col_names = ['BI_RADS', 'age', 'shape', 'margin', 'density', 'severity']
dataset = pd.read_csv('mammographic_masses.data.txt', names=col_names)
dataset.replace("?", np.nan)

After executing the above code, the question marks are still in the dataset. The data file looks like this:

5,67,3,5,3,1
4,43,1,1,?,1
5,58,?,5,3,1
4,28,1,1,3,0
5,74,1,5,?,1
  • Can you post a sample of your dataframe? I am having a hard time reproducing your issue. Commented Jun 3, 2018 at 4:27
  • Please post it in your question. Commented Jun 3, 2018 at 4:32
  • Thanks! I just posted it. Commented Jun 3, 2018 at 4:35
  • I am having no issues replacing "?" with NaN. Commented Jun 3, 2018 at 4:36
  • I used different code, which works; I am just wondering why the code in the question does not. This works: dataset = pd.read_csv('mammographic_masses.data.txt', names=col_names, na_values="?") Commented Jun 3, 2018 at 4:42
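
For reference, the na_values approach from the last comment can be sketched as follows. This is only an illustration: it reads a few rows from the question out of an in-memory string (io.StringIO) instead of the real file, and the names raw and missing_per_column are made up for the example.

```python
import io

import pandas as pd

# A few rows from the question, held in memory instead of the real data file.
raw = "5,67,3,5,3,1\n4,43,1,1,?,1\n5,58,?,5,3,1\n"
col_names = ['BI_RADS', 'age', 'shape', 'margin', 'density', 'severity']

# na_values tells the parser to treat "?" as missing while reading,
# so no replace() call is needed afterwards.
dataset = pd.read_csv(io.StringIO(raw), names=col_names, na_values="?")

# Count the missing entries in each column.
missing_per_column = dataset.isna().sum()
print(missing_per_column)
print(dataset.dtypes)
```

A side effect worth noting: because "?" never reaches the DataFrame as a string, the affected columns are parsed as numeric (float, since NaN is a float) rather than as text.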

1 Answer

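Use inplace=True. By default, DataFrame.replace returns a new DataFrame and leaves the original unchanged, so the result of your call is discarded.

Example:

dataset.replace("?", np.nan, inplace=True)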
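
A minimal sketch of the difference, assuming a few rows from the question loaded through io.StringIO in place of the real file. The in-memory sample and the names still_has_marks, marks_after, and nan_count are illustrative, not from the original post.

```python
import io

import numpy as np
import pandas as pd

# A few rows from the question, held in memory instead of the real data file.
sample = io.StringIO("5,67,3,5,3,1\n4,43,1,1,?,1\n5,58,?,5,3,1\n")
col_names = ['BI_RADS', 'age', 'shape', 'margin', 'density', 'severity']
dataset = pd.read_csv(sample, names=col_names)

# replace() returns a new DataFrame; without assignment the result is discarded
# and the original still contains the "?" strings.
dataset.replace("?", np.nan)
still_has_marks = bool((dataset == "?").any().any())

# With inplace=True the original DataFrame itself is modified.
dataset.replace("?", np.nan, inplace=True)
marks_after = bool((dataset == "?").any().any())
nan_count = int(dataset.isna().sum().sum())

print(still_has_marks, marks_after, nan_count)
```

Assigning the result back (dataset = dataset.replace("?", np.nan)) should have the same effect as inplace=True here.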

1 Comment

It works! Alternatively, I can assign the result back: dataset = dataset.replace("?", np.nan)
