0

I have the following DataFrame:

data = pd.read_csv("Example.csv")
data["Column1"]

     Column0   Column1

0      a       Gold 
1      b       Silver  
2      b       Silver (Running)
3      c       Bronze (800m)
4      c       Bronze 
5      a       2x Gold (500m)
6      a       Really Successful, 2x WM Gold (500m)

My goal is to reduce each string to just the medal it contains:

data = pd.read_csv("Example.csv")
data["Column1"]

     Column0     Column1

0      a         Gold
1      b         Silver
2      b         Silver
3      c         Bronze
4      c         Bronze
5      a         Gold
6      a         Gold

I've already tried the replace() method, but it doesn't work. My attempt:

data["Column1"] = data.replace({"Column1": "Silver"}, "Silver")

4 Answers

4

You can try str.extract

df['Column1'] = df['Column1'].str.extract('(Gold|Silver|Bronze)')
print(df)

  Column0 Column1
0       a    Gold
1       b  Silver
2       b  Silver
3       c  Bronze
4       c  Bronze
5       a    Gold
6       a    Gold

To ignore case, you can use the flags argument:

import re

df['Column1'] = df['Column1'].str.extract('(gold|silver|bronze)', flags=re.IGNORECASE)
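
One thing to keep in mind with the case-insensitive version: the extracted text keeps whatever casing appeared in the original cell. A minimal sketch of normalising it afterwards, assuming you want capitalised medal names (the sample rows here are made up for illustration and are not from the question):

```python
import re
import pandas as pd

# Hypothetical sample data with mixed casing (not the asker's CSV).
df = pd.DataFrame({"Column1": ["GOLD (500m)", "2x silver", "Bronze "]})

# expand=False returns a Series rather than a one-column DataFrame.
medals = df["Column1"].str.extract(
    "(gold|silver|bronze)", flags=re.IGNORECASE, expand=False
)

# The match keeps the casing found in the text, so normalise it afterwards.
df["Column1"] = medals.str.capitalize()
print(df["Column1"].tolist())  # ['Gold', 'Silver', 'Bronze']
```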

0

Try using:

data["Column1"] = data["Column1"].replace({'Silver (Running)': 'Silver'})
data["Column1"]
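
Note that a plain dict passed to replace only matches whole cell values, so every variant would need its own entry. As a rough sketch, assuming you want to stay with replace, passing regex=True lets one pattern per medal cover all variants (the three sample rows are a subset chosen for illustration):

```python
import pandas as pd

data = pd.DataFrame({"Column1": ["Silver", "Silver (Running)", "Bronze (800m)"]})

# A plain dict only matches whole cell values, so "Bronze (800m)" is untouched.
exact = data["Column1"].replace({"Silver (Running)": "Silver"})
print(exact.tolist())  # ['Silver', 'Silver', 'Bronze (800m)']

# With regex=True, each key is a pattern; ".*Medal.*" swallows the whole string.
by_pattern = data["Column1"].replace(
    {r".*Silver.*": "Silver", r".*Bronze.*": "Bronze"}, regex=True
)
print(by_pattern.tolist())  # ['Silver', 'Silver', 'Bronze']
```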

0

You need to define clearly the problem that you want to solve. Your problem here is not a use case for replace: what you want is to keep only the medal in the column "Column1", not to replace the whole string. You might solve this problem as follows.

Creation of the data frame:

import pandas as pd

df = pd.DataFrame({"Column0": ["a","b","b","c","c","a","a",], "Column1":[
    "Gold ",
    "Silver  ",
    "Silver (Running)",
    "Bronze (800m)",
    "Bronze ",
    "2x Gold (500m)",
    "Really Successful, 2x WM Gold (500m)",
]})

You can use apply on the column Column1 using the following function

def replace_string_by_medal(string):
    for medal in ["Gold","Silver","Bronze"]:
        if medal in string:
            return medal

df.Column1.apply(replace_string_by_medal)

This returns a column with the values you want, which you can assign back to Column1:

df["Column1"] = df.Column1.apply(replace_string_by_medal)

df
    Column0 Column1
0   a       Gold
1   b       Silver
2   b       Silver
3   c       Bronze
4   c       Bronze
5   a       Gold
6   a       Gold

1 Comment

apply is not efficient for this operation. Furthermore, using an additional for loop in your function makes it quadratic; you should not have to search as many times as there are possibilities!

0

As you have a defined list of possibilities, the easiest is to use str.extract:

df['Column1'] = df['Column1'].str.extract('(Gold|Silver|Bronze)')

output:

  Column0 Column1
0       a    Gold
1       b  Silver
2       b  Silver
3       c  Bronze
4       c  Bronze
5       a    Gold
6       a    Gold
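
If a row contains none of the listed medals, str.extract yields NaN for it. A small sketch of one way to handle that, assuming you would rather keep the original text for such rows (the "Did not place" row is a hypothetical example, not from the question):

```python
import pandas as pd

# "Did not place" is a made-up row used to show the no-match case.
df = pd.DataFrame({"Column1": ["2x Gold (500m)", "Did not place", "Bronze "]})

# expand=False returns a Series; rows without a match become NaN.
medal = df["Column1"].str.extract("(Gold|Silver|Bronze)", expand=False)

# Fall back to the original text where no medal was found.
df["Column1"] = medal.fillna(df["Column1"])
print(df["Column1"].tolist())  # ['Gold', 'Did not place', 'Bronze']
```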
