Python regex pattern that searches for a domain name

Question

I have a list of links, and some of them look like

https://www.domainname
or https://domainname

I need a regex pattern that extracts only the domain name from each link. The "www." prefix causes problems with my current pattern:

print(re.findall("//([a-zA-Z]+)", i))

You can create an optional non-capturing group: re.findall(r"//(?:www\.)?([a-zA-Z]+)", i)
– Wiktor Stribiżew, Commented Sep 2, 2022 at 13:22
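A minimal sketch of the optional non-capturing group suggested in the comment above. The sample list `links` and the helper name `domain_names` are assumptions made for illustration; they do not come from the question.

```python
import re

# Hypothetical sample data modelled on the two URL shapes in the question.
links = ["https://www.domainname", "https://domainname"]

# "//" anchors the match right after the scheme; (?:www\.)? consumes an
# optional "www." prefix without capturing it; ([a-zA-Z]+) captures the name.
PATTERN = re.compile(r"//(?:www\.)?([a-zA-Z]+)")

def domain_names(urls):
    """Return the captured name for each URL, or None when nothing matches."""
    names = []
    for url in urls:
        match = PATTERN.search(url)
        names.append(match.group(1) if match else None)
    return names

print(domain_names(links))  # ['domainname', 'domainname']
```

Because the group is non-capturing, `re.findall` with the same pattern would also return only the name, which is what the one-liner in the comment relies on.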

LetzerWille · Accepted Answer · 2022-09-02 13:59:56Z

0

You could anchor the match at the end of the string with $.

import re

url = "https://www.domainname"
url2 = "https://domainname"

for u in [url, url2]:
    print(u)
    # \w+$ matches the last run of word characters, so "www." is skipped
    print(re.findall(r"\w+$", u))

https://www.domainname
['domainname']
https://domainname
['domainname']

answered Sep 2, 2022 at 13:59

LetzerWille


Shahab Rahnama · Accepted Answer · 2022-09-02 14:07:11Z

0

My solution:

import re

l1 = ["https://www.domainname1", "https://domainname2"]
for i in l1:
    print(re.findall(r"/(?:www\.)?(\w+)", i))

Output:

['domainname1']
['domainname2']

answered Sep 2, 2022 at 14:07

Shahab Rahnama


score 0 · Accepted Answer · 2022-09-02 14:20:29Z

0

import re

with open('testfile.txt', 'r') as file:
    readfile = file.read()

# Expects hosts of the form sub.domain.tld; group(1) is the middle label
search = re.finditer(r'(?:\w+://)?(?:\w+\.)(\w+)(\.\w+)', readfile)

for check in search:
    print(check.group(1))  # prints only the domain name

Result:

domainname
example

edited Sep 2, 2022 at 14:20

answered Sep 2, 2022 at 14:05

user19789236
