Extract number from string in python without re

Question

How can I extract a number from a string in Python without using regex? I have seen isinstance, but the number could change to almost anything. Any ideas?

https://www.investopedia.com/articles/retirement/?page=6

Well, you're just reading from a query string in that case... parse it out and read the page parameter...
– Jeff Mercado, commented Sep 19, 2018 at 23:34

wim · Accepted Answer · 2018-09-19 23:37:41Z

2

It's a bit verbose, but I would use URL parsing for this. The advantage over regex is that you get some input validation for free, and more readable code.

>>> from urllib.parse import urlparse, parse_qs
>>> url = 'https://www.investopedia.com/articles/retirement/?page=6'
>>> parsed = urlparse(url)
>>> query = parse_qs(parsed.query)
>>> [page] = query['page']
>>> int(page)
6
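
The same idea can be wrapped in a small helper for URLs where the parameter may be missing or non-numeric. This is only a sketch: the name get_int_param and its default argument are made up for illustration and are not part of urllib.

```python
from urllib.parse import urlparse, parse_qs


def get_int_param(url, name, default=None):
    """Return query parameter `name` as an int, or `default` if absent or not an integer.

    Hypothetical helper for illustration; only urlparse and parse_qs come from the standard library.
    """
    query = parse_qs(urlparse(url).query)
    values = query.get(name)
    if not values:
        return default
    try:
        # If the parameter is repeated, this sketch just takes the first value.
        return int(values[0])
    except ValueError:
        return default


page = get_int_param('https://www.investopedia.com/articles/retirement/?page=6', 'page')
print(page)
```

Unlike the [page] = query['page'] unpacking above, which raises if the parameter is missing or repeated, this version falls back to a default. Which behaviour is better depends on how strict you want to be.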


Dani Mesejo · Accepted Answer · 2018-09-19 23:45:33Z

2

You can extract contiguous groups of digits anywhere in the string using the following:

from itertools import groupby

url = 'https://www.investopedia.com/articles/retirement/?page=6&limit=10&offset=15'
print([int(''.join(group)) for key, group in groupby(iterable=url, key=lambda e: e.isdigit()) if key])

Output

[6, 10, 15]
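
To see why this works, it may help to look at the runs that groupby produces before the digit runs are filtered out. The short string below is an invented example, not taken from the question.

```python
from itertools import groupby

text = 'p=6&n=10'  # made-up sample input

# groupby splits the string into consecutive runs that share the same key,
# here whether each character is a digit or not.
runs = [(is_digit, ''.join(chars)) for is_digit, chars in groupby(text, key=str.isdigit)]
print(runs)
# [(False, 'p='), (True, '6'), (False, '&n='), (True, '10')]

# Keeping only the digit runs and converting them gives the numbers.
numbers = [int(run) for is_digit, run in runs if is_digit]
print(numbers)
# [6, 10]
```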


yvesva · Accepted Answer · 2018-09-20 01:40:16Z

1

This assumes that there aren't multiple blocks of digits (e.g. www.something212.com/page=?13).

You could try using list comprehensions and str.isdigit()

url = 'https://www.investopedia.com/articles/retirement/?page=6'

digits = [d for d in url if d.isdigit()]  # every digit character, in order

digit = ''.join(digits)  # still a string; use int(digit) for an integer

print(digit)  # prints 6

Edited: now works with digits above 9
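
The caveat about multiple blocks of digits can be seen with a quick check. This is a sketch only, and the function name all_digits is invented for the example.

```python
def all_digits(text):
    """Join every digit character in `text`, regardless of where it appears."""
    return ''.join(ch for ch in text if ch.isdigit())


single = all_digits('https://www.investopedia.com/articles/retirement/?page=6')
merged = all_digits('www.something212.com/page=?13')

print(single)  # 6
print(merged)  # 21213 -- the 212 from the host name runs into the 13
```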

edited Sep 20, 2018 at 1:40

answered Sep 19, 2018 at 23:45


15 Comments

vash_the_stampede Over a year ago

what happens if that 6 is 12?

yvesva Over a year ago

digits would produce [6,12]. You could join the answer by using number = ''.join(map(int, digits))

vash_the_stampede Over a year ago

I know, I'm saying why not address that in your answer?

vash_the_stampede Over a year ago

you could just ''.join(digits) since you already know what's in there

Kamikaze_goldfish Over a year ago

@vash_the_stampede I’m gonna have to agree with you. This is a pretty good bit of code.


vash_the_stampede · Accepted Answer · 2018-09-19 23:39:23Z

1

If the URL always has that format, with digits only at the end, you could do this:

s = 'https://www.investopedia.com/articles/retirement/?page=25'
# Collect every digit character in the string, in order
digits = [ch for ch in s if ch.isdigit()]
print(''.join(digits))

(xenial)vash@localhost:~/python/stack_overflow$ python3.7 isdigit.py
25
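
If digits can also appear earlier in the URL, one possible variation is to keep only the run of digits at the very end. This is a sketch under that assumption, and trailing_number is a made-up name.

```python
def trailing_number(text):
    """Return the run of ASCII digits at the end of `text` as an int, or None if there is none."""
    # rstrip removes the trailing digits; whatever it removed is the number we want.
    stripped = text.rstrip('0123456789')
    tail = text[len(stripped):]
    return int(tail) if tail else None


print(trailing_number('https://www.investopedia.com/articles/retirement/?page=25'))
print(trailing_number('www.something212.com/page=?13'))
```

The first call gives 25 and the second gives 13, ignoring the 212 in the host name.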


yoonghm · Accepted Answer · 2018-09-20 04:35:30Z

0

I know you do not want to use re, but it is actually very powerful, and many libraries use it under the hood. Here is my solution for this situation:

import re

url = "www.fake888.com/article/?article=123&page=9&group=8"

numbers = re.findall(r'(?<==)(\d+)', url)
print(f'Found: {" ".join(numbers)}')

varval = re.findall(r'(\w+)=(\d+)', url)
urldict = {}
for name, value in varval:
    urldict[name] = value

print(urldict)

The output is

Found: 123 9 8
{'article': '123', 'page': '9', 'group': '8'}


1 Comment

Kamikaze_goldfish Over a year ago

I’m gonna have to check that out! Thanks for the link. I really have wanted to learn regex for a while because of its power. 💪
