Ewan

Check out Python's urlparse module (renamed urllib.parse in Python 3). It is part of the standard library, so nothing else needs to be installed.

So you could do the following:

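try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2
import re

# Matches 'http://' or 'https://' at the start of a URL.
HTTP_REGEX = re.compile(r'^https?://')

def check_and_add_http(url):
    """Return url unchanged if it already starts with 'http://' or 'https://'; otherwise prepend 'http://'."""
    if HTTP_REGEX.match(url):
        return url
    # urlsplit() only recognises the host when a scheme (or a leading '//') is present.
    return 'http://' + url

for url in url_list:  # url_list: your list of URL strings
    url = check_and_add_http(url)
    print(urlsplit(url)[1])  # index 1 is the network location (netloc)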

You can read more about urlsplit() in the documentation, which lists the index of each part of the URL in case you need something other than the host.
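As a rough illustration of those other parts, here is a minimal sketch for Python 3 (where the module is urllib.parse). The example.com URLs are made up for the demonstration and are not from the question. It also shows why the 'http://' prefix matters: without it, the host ends up in the path rather than in index 1.

```python
from urllib.parse import urlsplit  # on Python 2: from urlparse import urlsplit

# Hypothetical URL, used only to show what each index holds.
parts = urlsplit('http://www.example.com:8080/path/page.html?q=1#top')

# The result is a 5-item named tuple, so it can be unpacked by position...
scheme, netloc, path, query, fragment = parts

# ...or read by attribute name, which is equivalent to indexing.
same_netloc = parts[1] == parts.netloc

# hostname and port are extra attributes derived from netloc.
hostname = parts.hostname
port = parts.port

# Without a scheme, the host is treated as part of the path.
no_scheme = urlsplit('www.example.com/path')
```

Using the attribute names (parts.netloc rather than parts[1]) tends to read more clearly, but both forms return the same value.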
