63

I have a string and I want to extract the numbers from it. For example:

str1 = "3158 reviews"
print (re.findall('\d+', str1 ))

Output is ['4', '3']

I want to get only 3158, preferably as an integer, not as a list.

5
  • 15
    Output based on your code is ['3158']. Commented Nov 9, 2014 at 6:42
  • 10
    Then you must not be running the code shown above! Commented Nov 9, 2014 at 6:44
  • 1
    The fact that there is a 4 in your output but not in your input means you missed something else. The regex should be fine. Commented Dec 11, 2017 at 20:40
  • 6
    Seeing this question after 3 years makes me smile. I got so many upvotes for a stupid question, and yes, I was running the wrong code, but I didn't have time to say so in the comments back then. Commented Dec 12, 2017 at 16:38
  • 2
    What code were you running? Commented Oct 2, 2018 at 8:15

18 Answers

113

You can filter the string by digits using the str.isdigit method. In Python 2:

>>> int(filter(str.isdigit, str1))
3158

For Python 3, filter returns an iterator, so join the digits before converting:

int(''.join(filter(str.isdigit, my_str)))
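As a rough sketch of how the join-based digit filter behaves in Python 3 (the helper name digits_to_int is made up for illustration), note that it glues together every digit in the string, which matters once the input holds more than one number:

```python
def digits_to_int(text):
    """Join every digit character in text and convert the result to int.

    Hypothetical helper for illustration; raises ValueError if text has no digits.
    """
    return int(''.join(filter(str.isdigit, text)))


print(digits_to_int("3158 reviews"))             # 3158
print(digits_to_int("3158 reviews 3158asdf 4"))  # 315831584
```

If the numbers need to stay separate, a regex-based approach is probably the better fit.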

9 Comments

Doesn't seem to work for Python 3, presumably (though I'm not 100% sure) because in Python 3 filter is a class, and calling it like you did returns a filter object, which is an iterable, and int() can't cast a filter object into an int. It seems to me that there is no elegant way (such as in your answer, without using regex) to do this in Python 3. Is there?
@Ray I'm late here, but I bet other people will have this same problem. You can use int(list(filter(str.isdigit, my_str))[0]) for example. If filter returns an iterable, you just have to work with it :)
Adding to Juan's comment: int(''.join(list(filter(str.isdigit, my_str)))) if you want ALL the numbers in the string instead of just the first.
@apricity list is not needed: int(''.join(filter(str.isdigit, 'test3246')))
I'm not sure if this solution is good for the general case of number extraction. Consider using it on "3158 reviews 3158asdf 4".
41

This code works fine. There is definitely some other problem:

>>> import re
>>> str1 = "3158 reviews"
>>> print (re.findall('\d+', str1 ))
['3158']

1 Comment

Isn't that still a list? The question says a list is not wanted.
23
IntVar = int("".join(filter(str.isdigit, StringVar)))

1 Comment

Unlike the accepted answer, this does work for Python3
20

Your regex looks correct. Are you sure you haven't made a mistake with the variable names? In your code above you mix up total_hotel_reviews_string and str.

>>> import re
>>> s = "3158 reviews"
>>> 
>>> print(re.findall("\d+", s))
['3158']

Comments

8

You were quite close to the final answer. Your re.findall expression already returns every number it finds; the enclosing parentheses below are optional and give the same result:

re.findall(r'(\d+)', str1)

For a more general string like str1 = "3158 reviews, 432 users", this code would yield:

Output: ['3158', '432']

Now to obtain integers, you can map the int function to convert strings into integers:

A = list(map(int,re.findall('(\d+)',str1)))

Alternatively, you can use this list comprehension:

A = [ int(x) for x in re.findall('(\d+)',str1) ]

Both methods are equally correct. They yield A = [3158, 432].

Your final result for the original question would be the first entry in the list A, so we arrive at either of these expressions:

result = list(map(int,re.findall( '(\d+)' , str1 )))[0]

result = int(re.findall( '(\d+)' , str1 )[0])

Even if there is only one number present in str1, re.findall will still return a list, so you need to retrieve the first element A[0] manually.

Comments

7

To extract a single number from a string you can use re.search(), which returns the first match (or None):

>>> import re
>>> string = '3158 reviews'
>>> int(re.search(r'\d+', string).group(0))
3158

In Python 3.6+ you can also index into a match object instead of using group():

>>> int(re.search(r'\d+', string)[0])
3158
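Because re.search() returns None when nothing matches, calling .group() directly fails on strings without digits. A minimal sketch of one way to guard against that, assuming a hypothetical helper named first_int and a default parameter that are not part of the answer above:

```python
import re


def first_int(text, default=None):
    """Return the first run of digits in text as an int, or default if none."""
    match = re.search(r'\d+', text)
    if match is None:
        # No digits anywhere in the string
        return default
    return int(match.group(0))


print(first_int('3158 reviews'))  # 3158
print(first_int('no reviews'))    # None
```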

3 Comments

Does this only work in Python 2, or in both Python 2 and Python 3?
@Matheus Moreover, the re module is available in 1.5 and later :)
Another option that works in both Python 2 and 3 is the .findall method instead of .search. .findall always returns a list of strings containing every value found, so an index is required. For example, to always retrieve the last item: int(re.findall(r'\d+', string)[-1])
6

If the format is that simple (a space separates the number from the rest) then

int(str1.split()[0])

would do it.
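A small sketch of the same idea with a fallback for inputs that do not follow the simple format. The helper name leading_int is hypothetical, and returning None on failure is an assumption, not something the answer prescribes:

```python
def leading_int(text):
    """Return the first whitespace-separated token as an int, or None.

    Hypothetical helper: returns None when the string is empty or the
    first token is not a plain integer.
    """
    tokens = text.split()
    if not tokens:
        return None
    try:
        return int(tokens[0])
    except ValueError:
        return None


print(leading_int("3158 reviews"))  # 3158
print(leading_int("3158reviews"))   # None
```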

2 Comments

If the format is that simple YESSS ... I was passing a wrong variable... BTW ... you told me a shortest solution .. thanks
Nice, though it only works if there is whitespace after the number, i.e. for str1 = "3158 reviews" but not for str1 = "3158reviews".
6

Python 2.7:

>>> str1 = "3158 reviews"
>>> int(filter(str.isdigit, str1))
3158

Python 3:

>>> str1 = "3158 reviews"
>>> int(''.join(filter(str.isdigit, str1)))
3158

Comments

5

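This works for more complex cases, including negative numbers and decimals:

import re

str1 = "sg-23.0 300sdf343fc  -34rrf-3.4r" # All kinds of occurrences of numbers between strings
# optional minus sign, digits, optional decimal point, optional fractional digits
num = [float(s) for s in re.findall(r'-?\d+\.?\d*', str1)]
print(num)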

Output:

[-23.0, 300.0, 343.0, -34.0, -3.4]
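One caveat with that pattern is that a hyphen used as a separator is read as a minus sign, so "10-20" comes out as 10.0 and -20.0. A possible variant, offered only as a sketch (the pattern and the name extract_floats are assumptions, not part of the answer), treats a hyphen as a sign only when it does not directly follow a digit:

```python
import re

# Assumed variant: a minus sign counts only when it does not directly follow
# a digit, and a decimal point counts only when digits follow it.
NUMBER = re.compile(r'(?<!\d)-?\d+(?:\.\d+)?')


def extract_floats(text):
    """Return every number found in text as a float."""
    return [float(m) for m in NUMBER.findall(text)]


print(extract_floats("range 10-20 and -3.5"))  # [10.0, 20.0, -3.5]
```

Whether a hyphen should count as a sign depends on the data, so this is a trade-off rather than a strict improvement.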

Comments

4

Above solutions seem to assume integers. Here's a minor modification to allow decimals:

num = float("".join(filter(lambda d: str.isdigit(d) or d == '.', inputString)))

(It doesn't account for a - sign, and it assumes any period is properly placed in the digit string, not just some English-language period lying around. It's not built to be indestructible, but it worked for my data case.)

1 Comment

Love it. Super-simple solution for a certain class of problems.
2

There may be a small problem with the code from Vishnu's answer. If there are no digits in the string, it will raise a ValueError. Here is my suggestion to avoid this (Python 2):

>>> digit = lambda x: int(filter(str.isdigit, x) or 0)
>>> digit('3158 reviews')
3158
>>> digit('reviews')
0
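In Python 3 the same fallback needs the digits joined first, because a filter object is always truthy and cannot be passed to int(). A minimal sketch, keeping the name digit from the snippet above:

```python
def digit(text):
    """Return all digits in text joined as one int, or 0 if there are none."""
    # ''.join(...) yields '' when no digits are found; '' is falsy, so `or 0` applies.
    return int(''.join(filter(str.isdigit, text)) or 0)


print(digit('3158 reviews'))  # 3158
print(digit('reviews'))       # 0
```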

Comments

2

For python3

input_str = '21ddd3322'
int(''.join(filter(str.isdigit, input_str)))

> 213322

Comments

1
a = []
line = "abcd 3455 ijkl 56.78 ij"
for word in line.split():
    try:
        # keep only the words that parse as numbers
        a.append(float(word))
    except ValueError:
        pass
print(a)

OUTPUT

[3455.0, 56.78]

Comments

1

I am a beginner in coding. This is my attempt to answer the question. I used Python 3.7 without importing any libraries.

This code extracts and prints the decimal numbers from a string made of sets of characters separated by blanks (words).

Attention: If there is more than one number, each one is printed, and a ends up holding the last value.

line = input ('Please enter your string ')
for word in line.split():
    try:
        a=float(word)
        print (a)
    except ValueError:
        pass

Comments

1

My answer does not require any additional libraries, and it's easy to understand. Note, however, that if there is more than one number in the string, my code returns everything from the first digit to the last digit, including any non-digit characters in between.

def search_number_string(string):
    index_list = []
    del index_list[:]
    for i, x in enumerate(string):
        if x.isdigit():
            index_list.append(i)
    start = index_list[0]
    end = index_list[-1] + 1
    number = string[start:end]
    return number
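For strings that hold several numbers, one library-light alternative is to split the string into runs of digits instead of slicing from the first digit to the last. This is only a sketch using itertools.groupby from the standard library; the name digit_runs is hypothetical:

```python
from itertools import groupby


def digit_runs(text):
    """Return each run of consecutive digits in text as a separate string."""
    return [
        ''.join(group)
        for is_digit, group in groupby(text, key=str.isdigit)
        if is_digit
    ]


print(digit_runs("abc123def45"))  # ['123', '45']
print(digit_runs("no digits"))    # []
```

Unlike the slicing approach, this returns an empty list instead of raising an IndexError when the string contains no digits.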

3 Comments

Why are you: del index_list[:]?
I just want to make sure the index_list is empty
Does it work if there are multiple digits separated by different letters, like 'Repeats\t70559.1:2\t2579485.5\n'? For example, if I want 70559.1:2 in one list and 2579485.5 in another, considering that the number of digits might change in this string?
0

You can use the method below to extract all the digits from a string.

def extract_numbers_from_string(string):
    number = ''
    for i in string:
        try:
            number += str(int(i))
        except ValueError:
            pass
    return number

Alternatively, you could use i.isdigit() or i.isnumeric() (in Python 3):

def extract_numbers_from_string(string):
    number = ''
    for i in string:
        if i.isnumeric():
            number += str(int(i))
    return number


a = '343fdfd3'
print (extract_numbers_from_string(a))
# 3433

Comments

0

Using a list comprehension and Python 3:

>>> int("".join([c for c in str1 if str.isdigit(c)]))
3158

1 Comment

The list isn't necessary
0

This approach extracts numbers from a string in general.

To get all the numeric occurrences:

  • Use split() to convert the string into a list of words, then a list comprehension to iterate over the list, with isdigit() selecting the words made up only of digits.

Getting numbers from a string with a list comprehension, isdigit(), and split():

test_string = "i have four ballons for 2 kids"

# list comprehension + isdigit() +split()

res = [int(i) for i in test_string.split() if i.isdigit()]
print("The numbers list is : "+ str(res))

To extract numeric values from a string in Python:

  • Find all the integers in a string where they are separated by lowercase characters, using re.findall(expression, string).

  • Convert each number from a string to an integer, then find the maximum.

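import re

def extractMax(text):

    # get a list of all numbers separated by lower case characters
    # \d+ is a regular expression which means one or more digits
    numbers = re.findall(r'\d+', text)

    # convert to int so max() compares numerically, not as strings
    numbers = map(int, numbers)
    return max(numbers)

if __name__=="__main__":
    # sample input chosen for illustration; max() raises ValueError if no digits are found
    text = '100klh564abc365bg'
    print(extractMax(text))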

1 Comment

This answer is overly-long, and confusingly worded. Code should usually be short executable snippets, not complete scripts (i.e. checking for __name__=="__main__" is just a distraction.)
