DEV Community

datatoinfinity

Pattern With Alphanumeric using Regex - NLP

An alphanumeric string is a string made up of the letters a-z and A-Z and the digits 0-9.

Now we will learn how to search for a pattern in an alphanumeric string.

import re
print(re.findall('at','The rat sat on mat and attached by a cat'))
Output
['at', 'at', 'at', 'at', 'at']

It returns every occurrence of 'at' in the text.

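Now we will use '.' (a dot), which matches any single character, to get the character before the pattern. Note that a space also counts as a character, which is why ' at' shows up in the result.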

import re
print(re.findall('.at','The rat sat on mat and attached by a cat'))
['rat', 'sat', 'mat', ' at', 'cat']
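Because the dot must match exactly one character, an 'at' at the very start of the string has nothing in front of it and is skipped. A minimal sketch of that behaviour, using a made-up sentence and illustrative variable names:

```python
import re

text = 'at noon the cat sat'

# '.at' needs one character before 'at', so the leading 'at' is not matched
matches = re.findall('.at', text)
print(matches)  # ['cat', 'sat']
```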

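If we want the digits from a text, we can use '\d' to fetch them. The pattern is written as a raw string (r'...') so that Python does not treat the backslash as a string escape.

import re
print(re.findall(r'\d','On 2 June there is meeting 1 PM in room no. 5'))
Output:
['2', '1', '5']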

To fetch a digit at the end of the text, we use '$'.

import re
print(re.findall(r'\d$','On 2 June there is meeting 1 PM in room no. 5'))
Output:
['5']

But if we remove the '5' from the end of the string, it returns an empty list.

import re
print(re.findall(r'\d$','On 2 June there is meeting 1 PM in room no. '))
Output:
[]

When the text contains a number like '12', '\d' returns its digits separately, as '1' and '2'.

import re
print(re.findall(r'\d','On 2 June there is meeting 12 PM in room no 5'))
['2', '1', '2', '5']

To solve this we can use '\d\d', which matches two digits in a row, but it returns only the two-digit number.

import re
print(re.findall(r'\d\d','On 2 June there is meeting 12 PM in room no 5'))
['12']
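Since '\d\d' matches exactly two digits in a row, the single digits '2' and '5' are left out. One possible way to get every number whole, whatever its length, is the '+' quantifier, which means "one or more of the preceding token". This is only a sketch, and the variable names are illustrative:

```python
import re

text = 'On 2 June there is meeting 12 PM in room no 5'

# \d+ matches one or more consecutive digits, so each number comes back whole
numbers = re.findall(r'\d+', text)
print(numbers)  # ['2', '12', '5']
```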

To fetch a digit at the start of the text, we use '^'.

import re
print(re.findall(r'^\d','2 June there is meeting 12 PM in room no 5'))
['2']

To generalise: the start ('^') and end ('$') patterns fetch a digit only if it is the very first or very last character of the string. If the digits appear only somewhere in between, they return an empty list.
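A small sketch of that rule, using a made-up sentence where the digits sit only in the middle of the string (the variable names are illustrative):

```python
import re

text = 'Room 5 is booked on 2 June'

# The string starts with 'R' and ends with 'e', so neither anchored pattern matches
at_start = re.findall(r'^\d', text)
at_end = re.findall(r'\d$', text)
print(at_start)  # []
print(at_end)    # []

# Without the anchors, the same digits are found
anywhere = re.findall(r'\d', text)
print(anywhere)  # ['5', '2']
```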

Fetch only the upper case letters.

import re
print(re.findall('[A-Z]','Data to Infinity'))
['D', 'I']

Fetch only the lower case letters.

import re
print(re.findall('[a-z]','Data to Infinity'))
['a', 't', 'a', 't', 'o', 'n', 'f', 'i', 'n', 'i', 't', 'y']

Fetch both upper case and lower case letters.

import re
print(re.findall('[A-Za-z]','Data to Infinity'))
['D', 'a', 't', 'a', 't', 'o', 'I', 'n', 'f', 'i', 'n', 'i', 't', 'y']

Fetch upper case letters, lower case letters and digits.

import re
print(re.findall('[A-Za-z0-9]','Data to Infinity 19573'))
['D', 'a', 't', 'a', 't', 'o', 'I', 'n', 'f', 'i', 'n', 'i', 't', 'y', '1', '9', '5', '7', '3']
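The character class on its own returns one character per match. If whole alphanumeric words are wanted instead, one option is to add the '+' quantifier after the class, so that consecutive matching characters are grouped together. A minimal sketch with illustrative variable names:

```python
import re

text = 'Data to Infinity 19573'

# [A-Za-z0-9]+ groups consecutive letters and digits; spaces split the matches
words = re.findall('[A-Za-z0-9]+', text)
print(words)  # ['Data', 'to', 'Infinity', '19573']
```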
