edited Jun 9, 2013 at 6:36

Check out Python's urlparse module. It is part of the standard library, so nothing else needs to be installed. (In Python 3 the same functions live in urllib.parse.)

So you could do the following:

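try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2
import re

# Matches 'http://' or 'https://' at the start of a URL.
HTTP_REGEX = re.compile(r'^https?://')

def check_and_add_http(url):
    # Return the URL unchanged if it already starts with 'http://' or 'https://';
    # otherwise prepend 'http://' so that urlsplit() can identify the host.
    if HTTP_REGEX.match(url):
        return url
    return 'http://' + url

# url_list is assumed to be defined already (the URLs you want to process).
for url in url_list:
    url = check_and_add_http(url)
    print(urlsplit(url)[1])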

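urlsplit() returns a 5-item tuple (scheme, netloc, path, query, fragment), so index 1 is the network location, i.e. the host. You can read more about urlsplit() in the documentation, including which index to use if you want one of the other parts of the URL.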
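To show why the scheme check matters, here is a minimal sketch for Python 3, where urlsplit is imported from urllib.parse. The helper name host_of and the example URLs are made up for illustration and are not from the question. It also uses the named attribute .netloc, which is equivalent to index [1].

```python
from urllib.parse import urlsplit  # Python 3 location of urlsplit


def host_of(url):
    """Return the network location of url (hypothetical helper).

    'http://' is prepended when the URL has no http/https scheme,
    because urlsplit() only fills in netloc when it sees '//'.
    """
    if not url.lower().startswith(("http://", "https://")):
        url = "http://" + url
    return urlsplit(url).netloc


# Without a scheme, urlsplit() treats the whole string as a path,
# so netloc comes back as an empty string.
print(repr(urlsplit("www.example.com/page").netloc))

# With the scheme added, the host is picked up as expected.
print(host_of("www.example.com/page"))
print(host_of("https://example.org/a/b?x=1"))
```

A plain startswith() check is used here instead of a regex purely to keep the sketch short; the regex version above behaves the same for lowercase schemes.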


answered Jun 7, 2013 at 11:53

Ewan

Check out Python's urlparse module. It is part of the standard library, so nothing else needs to be installed.

So you could do the following:

import urlparse

for url in url_list:
    print(urlparse.urlsplit(url)[1])

You can read more about urlsplit() in the documentation, including the indexes if you want to read the other parts of the URL.
