Find and Replace Python

Question

I am working with pandas and a rather large Excel document. My goal is to find particular characters in a string and replace them with nothing, essentially removing them. The strings are all in one column. Below is the code I wrote to do the find and replace. Python gives me no error message, but when I check the saved file nothing has changed. What am I doing wrong?

import pandas as pd

df1 = pd.read_csv('2020.csv')

(df1.loc[(df1['SKU Code'].str.contains ('-DG'))])

dfDGremoved = (df1.loc[(df1['SKU Code'].str.contains('-DG'))].replace('-DG',''))

dfDGremoved.to_csv('2020DRAFT.csv')

Why check to see if the string contains what you're replacing? Just replace it first. Does this not work: df1['SKU Code'] = df1['SKU Code'].replace('-DG', ''), and then just df1.to_csv('2020DRAFT.csv')?
– Brian, Commented Mar 3, 2020 at 20:12
The line (df1.loc[(df1['SKU Code'].str.contains ('-DG'))]) doesn't have any effect.
– AMC, Commented Mar 3, 2020 at 20:27

kadu · Accepted Answer · 2020-03-03 20:56:29Z

Your code is a bit overengineered. Python's str.replace method leaves a string unchanged when it does not contain the substring you want to replace, so the contains call is unnecessary. Creating a second dataframe is also unnecessary: you can assign the result straight back to the original column.

To achieve the result you want, you can use map, which applies a function to every element in a Series (which is what a single column of a DataFrame is), combined with a lambda function:

df1 = pd.read_csv('2020.csv')
df1['SKU Code'] = df1['SKU Code'].map(lambda x: x.replace('-DG', ''))
df1.to_csv('2020DRAFT.csv')

Unpacking this a bit:

df1['SKU Code'] = df1['SKU Code'].map(lambda x: x.replace('-DG', ''))
  |                     |          |         └─ Create a nameless function which 
  |                     |          |            takes a string and removes '-DG'
  |                     |          |            from it 
  |                     |          |
  |                     |          └─ ...and run this function on every element...
  |                     |
  |                     └─ ... of the 'SKU Code' column in df1...
  |
  └── ... Then store the results in that same column
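
One caveat with the map approach is that the lambda calls x.replace on every element, so a missing value (NaN, which is a float) would raise an AttributeError. The following is a minimal sketch of one way to guard against that. It assumes a small made-up DataFrame in place of 2020.csv and relies on the na_action="ignore" option of Series.map to skip missing values:

```python
import pandas as pd

# Hypothetical sample data standing in for the 'SKU Code' column of 2020.csv.
df1 = pd.DataFrame({"SKU Code": ["A100-DG", "B200", None, "C300-DG"]})

# na_action="ignore" passes missing values through untouched,
# so the lambda only ever receives actual strings.
df1["SKU Code"] = df1["SKU Code"].map(
    lambda x: x.replace("-DG", ""), na_action="ignore"
)

# Show the cleaned, non-missing SKU codes.
print(df1["SKU Code"].dropna().tolist())
```

If the column is known to contain no missing values, the plain map call shown earlier is enough.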

Varsha · Accepted Answer · 2020-03-03 20:56:32Z

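You can use pandas.Series.str.replace(), which replaces substrings inside each string of the column. Whether the pattern is treated as a regular expression by default depends on the pandas version, so passing regex=False makes the literal match explicit.

dfDGremoved = df1.copy()
dfDGremoved['SKU Code'] = dfDGremoved['SKU Code'].str.replace('-DG', '', regex=False)
dfDGremoved.to_csv('2020DRAFT.csv')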
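
A likely reason the code in the question changed nothing is that DataFrame.replace and Series.replace, when called without regex=True, compare against whole cell values rather than substrings. The sketch below illustrates the difference on a small made-up column (the sample values are assumptions, not data from the question):

```python
import pandas as pd

# Hypothetical sample data; the last cell is exactly '-DG'.
df1 = pd.DataFrame({"SKU Code": ["A100-DG", "B200", "-DG"]})

# Series.replace without regex=True only replaces cells that are
# exactly equal to '-DG'; 'A100-DG' is left untouched.
whole_value = df1["SKU Code"].replace("-DG", "")

# Series.str.replace works on substrings inside each string.
substring = df1["SKU Code"].str.replace("-DG", "", regex=False)

# Series.replace with regex=True also substitutes inside each string.
with_regex = df1["SKU Code"].replace("-DG", "", regex=True)

print(whole_value.tolist())
print(substring.tolist())
print(with_regex.tolist())
```

For a literal pattern like '-DG', the str.replace form with regex=False is arguably the clearest, since it avoids any surprises from regex metacharacters.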
