I have Python code like this:

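#!/usr/bin/env python3
# In Python 3, urlparse lives in urllib.parse
from urllib.parse import urlparse

url = 'https://pastebin.com/raw/EgGZmEqY'
parsed = urlparse(url)
# netloc is the host part of the URL, e.g. 'pastebin.com'
site = parsed.netloc
print(site)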

Whether or not the URL comes from a raw paste, I want to grab just the site name, without https://, http://, or www. For example, the raw paste contains websites in the forms below, and from each of them I want to get just example.com:

https://example.com
http://example.com
www.example.com
example.com

How can I get the site without https, http, and www? Thank you!

1 Answer

I take it that you just want the registered domain (the name plus its suffix, e.g. example.com) without the subdomains or scheme.

From this Stack Overflow answer, it seems all you need is:

>>> import tldextract
>>> tldextract.extract('http://forums.news.cnn.com/')
ExtractResult(subdomain='forums.news', domain='cnn', suffix='com')

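In your case, then, I would use this:

#!/usr/bin/env python3

import tldextract

url = 'https://www.pastebin.co.uk/raw/EgGZmEqY'

# Split the URL into subdomain, domain, and public suffix
parsed = tldextract.extract(url)
# Join domain and suffix, dropping the scheme and any subdomain such as "www"
domain = parsed.domain + '.' + parsed.suffix

print(domain)  # prints: pastebin.co.uk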
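
If installing tldextract is not an option, the four forms listed in the question can probably be handled with the standard library alone. The following is only a sketch under that assumption: `bare_host` is a made-up helper name, not part of any library, and it relies on the fact that `urlparse` fills in the host only when the string has a scheme or starts with `//`.

```python
from urllib.parse import urlparse


def bare_host(url):
    """Return the host of `url` without scheme, port, or a leading 'www.'."""
    url = url.strip()
    # Without a scheme, urlparse treats the whole string as a path,
    # so prefix '//' to make it parse as a network location.
    if '://' not in url:
        url = '//' + url
    host = urlparse(url).hostname or ''
    # Drop only a leading 'www.'; other subdomains are left untouched.
    if host.startswith('www.'):
        host = host[len('www.'):]
    return host


for u in ['https://example.com', 'http://example.com',
          'www.example.com', 'example.com']:
    print(bare_host(u))
```

Note that this keeps subdomains other than www (forums.news.cnn.com stays as it is). If you need the registered domain regardless of subdomain, tldextract is the safer choice, because suffixes like co.uk cannot be guessed by splitting on dots.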

2 Comments

You should provide code which works with the OP's exact data. Cutting and pasting from another question doesn't help much.
But that is just for one domain. How do I grab the domains from a raw paste or another website, like in my Pastebin link?
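
For the raw paste case, one possible approach is to read the paste as text and process it line by line. This is a rough sketch, assuming the paste holds one URL per line; the actual content of the Pastebin link is not shown in the question, so `sample` below is a stand-in, and `hosts_from_text` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def hosts_from_text(text):
    """Return the bare host for every non-empty line of `text`."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # urlparse needs a scheme or a leading '//' to recognise the host
        if '://' not in line:
            line = '//' + line
        host = urlparse(line).hostname or ''
        if host.startswith('www.'):
            host = host[len('www.'):]
        if host:
            hosts.append(host)
    return hosts


# Stand-in for the paste body (assumed format: one URL per line)
sample = """https://example.com
http://example.com
www.example.com
example.com
"""
print(hosts_from_text(sample))

# To read a real paste, the text could be fetched first, for example:
# from urllib.request import urlopen
# text = urlopen('https://pastebin.com/raw/EgGZmEqY').read().decode('utf-8')
# print(hosts_from_text(text))
```

The same loop would work with tldextract.extract() in place of urlparse if you want the registered domain rather than the host.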
