Replacing Strings In Pandas Column

Question

I have a pandas column like this (example):

1                               France
2                               France
3                              Germany
4                              Germany
5                              Germany
6                                Spain
7                                Spain
8                                Spain
175                           France.2
176                           France.2
177                          Germany.2
178                          Germany.2
179                          Germany.2
180                               UK.1
181                               UK.1
182                               UK.1
183                            Italy.2
184                            Italy.2
185                            Italy.2

The numbers on the left are my index and the values are df[0].

I am trying to locate the suffixes ".1", ".2", and so on up to ".4", and remove them.

rename_rows = ['.1', '.2', '.3', '.4']
for row in df[0]:
    for r in rename_rows:
        if r in row:
            df[0] = df[0].replace(r, '')

Nothing happens when I run this.

If I get down to the last check, "if r in row:", and add print('True'), it prints as expected. I've also tried changing df[0] = df[0].replace(r, '') to df[0] = df[0].replace(row, ''), and that successfully deletes the entire country name. However, I just want to delete the ".1" portion.

Any thoughts on why it won't delete that portion only?

Quang Hoang · Accepted Answer · 2020-11-05 21:40:40Z


You can use str.extract:

df[0].str.extract(r'^([^.]+)')

Output:

           0
1     France
2     France
3    Germany
4    Germany
5    Germany
6      Spain
7      Spain
8      Spain
175   France
176   France
177  Germany
178  Germany
179  Germany
180       UK
181       UK
182       UK
183    Italy
184    Italy
185    Italy
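
To write the result back to the column, something like the following may help. It is a minimal sketch on a made-up four-row sample (not the asker's full data), assuming the column is labelled 0 as in the question. Passing expand=False makes str.extract return a Series rather than a one-column DataFrame, so it can be assigned directly.

```python
import pandas as pd

# Illustrative sample only; index values mimic the question's data
df = pd.DataFrame(
    {0: ["France", "Germany.2", "UK.1", "Italy.2"]},
    index=[1, 177, 180, 183],
)

# Capture everything before the first literal dot.
# expand=False returns a Series instead of a one-column DataFrame.
df[0] = df[0].str.extract(r"^([^.]+)", expand=False)

print(df[0].tolist())
```

Note that this pattern cuts at the first dot wherever it occurs, so a name that legitimately contains a dot would be truncated as well.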


Friedrich · Accepted Answer · 2024-07-19 12:03:44Z


Use .str.replace() to replace the endings you don't want with an empty string:

df[0] = df[0].str.replace(r'\.[0-4]$', '', regex=True)

Explanation of the regex:
\. matches a literal dot, [0-4] matches a single digit from 0 to 4, and $ anchors the match to the end of the string. So when a string ends with a dot followed by one of those digits, that suffix is replaced with an empty string.
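
As a rough illustration of why the loop in the question appears to do nothing, the sketch below contrasts Series.replace with Series.str.replace on a made-up sample. By default, Series.replace only swaps cells whose whole value equals the search string, whereas .str.replace operates on substrings. The explicit regex=True is there because recent pandas versions (2.0 and later) treat the pattern as a literal string unless told otherwise.

```python
import pandas as pd

# Illustrative sample; "Spain.5" is included to show what the pattern leaves alone
s = pd.Series(["France", "France.2", "UK.1", "Spain.5"])

# Series.replace compares against whole cell values, so no cell equals ".2"
unchanged = s.replace(".2", "")

# Series.str.replace works on substrings; regex=True enables the pattern
stripped = s.str.replace(r"\.[0-4]$", "", regex=True)

print(unchanged.tolist())
print(stripped.tolist())
```

If only ".1" through ".4" should be removed, as the question describes, the character class could be narrowed to [1-4].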

Answered Nov 5, 2020 at 21:45 by Sander van den Oord; edited Jul 19, 2024 at 12:03 by Friedrich.
