Hi, first-time regex user here. I'm trying to figure out a regex and need some help.

I have a text file with the following items:

10:67 12:12 01:50 23:60 23:50

And I'm trying to get a list of the valid times so the output should be:

['12:12', '01:50', '23:50']

Here is my code:

import re
inFile = open("text.txt")
text = inFile.read()
pattern = re.findall('([01]\d|2[0-3]):[0-5]\d', text)
print pattern

My output is:

['12', '01', '23']

Any help figuring out what's wrong? Thanks!

1 Answer

When the pattern contains a capturing group, re.findall() returns only what that group captured (that's ([01]\d|2[0-3]), the hours, in your case) instead of the whole match. If you make it a non-capturing group, (?: ... ), you should see the desired result:

import re
text = '10:67 12:12 01:50 23:60 23:50'
# raw string so the backslashes reach the regex engine unchanged
pattern = re.findall(r'(?:[01]\d|2[0-3]):[0-5]\d', text)
print(pattern)

displays:

['12:12', '01:50', '23:50']

More info on (non-) capturing groups: http://www.regular-expressions.info/brackets.html
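
If you would rather keep the capturing group (say, because you also want the hours on their own), one possible alternative is re.finditer(), which yields match objects instead of strings. This is only a sketch using the sample text from the question; the names capturing, whole_matches and hours_only are made up for illustration:

```python
import re

text = '10:67 12:12 01:50 23:60 23:50'

# Same pattern as in the question, capturing group around the hours.
capturing = re.compile(r'([01]\d|2[0-3]):[0-5]\d')

# group(0) is always the entire match, no matter how many groups exist.
whole_matches = [m.group(0) for m in capturing.finditer(text)]

# group(1) is only the text captured by the first group (the hours).
hours_only = [m.group(1) for m in capturing.finditer(text)]

print(whole_matches)
print(hours_only)
```

With the sample text, whole_matches should hold the three valid times and hours_only the list the question was getting from findall().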

2 Comments

Actually, findall() returns all groups or, if there are no groups, the entire match (to see this, add a second group around the minutes to the original expression and Python will return a tuple). Making the group non-capturing is the correct answer though.
@Blair, I was already looking for an explanation in the Python docs to find out the exact behavior. Thanks!
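
The findall() behavior described in the first comment can be checked with a small experiment. This is a sketch against the sample text from the question; the variable names no_groups, one_group and two_groups are only illustrative:

```python
import re

text = '10:67 12:12 01:50 23:60 23:50'

# No capturing groups: findall() returns the whole matches as strings.
no_groups = re.findall(r'(?:[01]\d|2[0-3]):[0-5]\d', text)

# One capturing group: findall() returns only that group's text.
one_group = re.findall(r'([01]\d|2[0-3]):[0-5]\d', text)

# Two capturing groups: findall() returns one tuple per match.
two_groups = re.findall(r'([01]\d|2[0-3]):([0-5]\d)', text)

print(no_groups)   # ['12:12', '01:50', '23:50']
print(one_group)   # ['12', '01', '23']
print(two_groups)  # [('12', '12'), ('01', '50'), ('23', '50')]
```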
