0

I want to write a function that breaks a string into a list of str by splitting it at various punctuation characters (e.g. , ! ?) that I specify. I know I should use the .split() method with each punctuation character, but I can't figure out how to repeat the split for every specified character so that I end up with a single list of str made up of the original string split at every punctuation character.

1 Answer

3

To split with multiple delimiters, you should use re.split():

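import re
# Character class: matches any one of the listed punctuation characters.
pattern = r"[.,!?]"  # add further delimiters inside the brackets
# Split at every match; the delimiters themselves are not kept.
new = re.split(pattern, your_current_string)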

Putting that in function form should be simple enough.
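For reference, here is one way the function form might look. This is only a sketch: the name split_on_punctuation and the delimiters parameter are made up for illustration, and dropping empty pieces and stripping whitespace is an assumption about what the asker wants, not something re.split() does by itself.

```python
import re

def split_on_punctuation(text, delimiters=".,!?"):
    """Split text at every character in delimiters (hypothetical helper)."""
    # Build a character class; re.escape keeps characters such as ']' or '^' literal.
    # Note: an empty delimiters string would produce an invalid pattern.
    pattern = "[" + re.escape(delimiters) + "]"
    pieces = re.split(pattern, text)
    # Assumption: surrounding whitespace and empty pieces are not wanted.
    return [piece.strip() for piece in pieces if piece.strip()]

result = split_on_punctuation("Hello!I'd like, to say something. 'World'.")
print(result)
```

Without the filtering step, re.split() returns an empty string wherever the text ends with a delimiter or two delimiters are adjacent, so remove the list comprehension if those empty entries matter.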

4 Comments

Using your method I get a list of empty strings. re.split(pattern, "Hello!I'd like, to say something. 'World'.") returns '["", "", "", "", "", "", ""]'
@bvidal That's because I forgot to escape the full stop (which meant it was splitting on everything); thanks for telling me. Try it again now.
It's probably a better idea to write the regex directly (pattern = r"[.,!?]"), or use re.escape: pattern='|'.join(map(re.escape, delimiters)).
@YannVernier I'd say that's definitely a better idea. Edited.
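The re.escape variant suggested in the comments could be sketched like this. The delimiters list is an assumed example; joining the escaped delimiters with | builds an alternation, which also works when a delimiter is longer than one character.

```python
import re

# Assumed example list; each entry is escaped so that "." and "?" match literally.
delimiters = [".", ",", "!", "?"]
pattern = "|".join(map(re.escape, delimiters))

parts = re.split(pattern, "Hello!I'd like, to say something. 'World'.")
print(parts)
```

With the full stop escaped, the split no longer happens on every character. The result still ends with an empty string because the input ends with a delimiter.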
