Revisions to Extract unique email addresses from a text file

Added \w as recommended by user @Dewi Morgan.

edited Jun 18 at 17:40

Documentation

The PEP 8 style guide recommends adding docstrings for functions. For example:

def extract_unique_emails(filename):
    """
    Extract unique email addresses from a text file.
    The input is a path to a text file.
    The function returns a list of email addresses.
    """

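As a small illustration of why the docstring is worth adding: once it is in place, tools such as help() and inspect.getdoc() can display it. The function body below is a placeholder (an assumption for this sketch), not the code under review.

```python
import inspect


def extract_unique_emails(filename):
    """
    Extract unique email addresses from a text file.
    The input is a path to a text file.
    The function returns a list of email addresses.
    """
    # Placeholder body for this sketch; only the docstring matters here.
    raise NotImplementedError


# inspect.getdoc() returns the docstring with the indentation cleaned up.
doc = inspect.getdoc(extract_unique_emails)
print(doc)
```
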
Simpler

These 2 lines:

unique_emails = sorted(set(emails))

return unique_emails

can be combined into 1 line:

return sorted(set(emails))

Similarly, these lines:

emails = extract_unique_emails("sample.txt")
for email in emails:

can be combined:

for email in extract_unique_emails("sample.txt"):

The regular expression:

email_pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'

can be simplified using \d, \w and the IGNORECASE flag:

email_pattern = r'[\w.%+-]+@[a-z\d.-]+\.[a-z]{2,}'
emails = re.findall(email_pattern, content, re.IGNORECASE)
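
Putting the suggestions together, a minimal sketch might look like the following. To keep it self-contained it works on a string rather than a file, so the name extract_unique_emails_from_text and the sample text are assumptions for illustration, not part of the reviewed code. It also shows one side effect to be aware of: for str patterns in Python 3, \w matches non-ASCII word characters too, unless the re.ASCII flag is added.

```python
import re

# Simplified pattern from above; letters are matched case-insensitively
# because re.IGNORECASE is passed to re.findall.
EMAIL_PATTERN = r'[\w.%+-]+@[a-z\d.-]+\.[a-z]{2,}'


def extract_unique_emails_from_text(content):
    """Return a sorted list of the unique email addresses found in content."""
    return sorted(set(re.findall(EMAIL_PATTERN, content, re.IGNORECASE)))


# Hypothetical sample text with one duplicate address.
sample = "Contact Bob@Example.com, alice@example.org, or Bob@Example.com again."
for email in extract_unique_emails_from_text(sample):
    print(email)

# \w is Unicode-aware by default, so an accented local part is accepted...
unicode_hits = re.findall(EMAIL_PATTERN, "jos\u00e9@example.com", re.IGNORECASE)
# ...while re.ASCII restricts \w to [a-zA-Z0-9_], closer to the original class.
ascii_hits = re.findall(EMAIL_PATTERN, "jos\u00e9@example.com",
                        re.IGNORECASE | re.ASCII)
```

Whether the wider Unicode matching is wanted depends on the input data; adding re.ASCII is one way to keep the behavior close to the original character class.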

added 68 characters in body

edited Jun 17 at 16:45

toolic

Documentation

The PEP 8 style guide recommends adding docstrings for functions. For example:

def extract_unique_emails(filename):
    """
    Extract unique email addresses from a text file.
    The input is a path to a test file.
    The function returns a list of email addresses.
    """

Simpler

These 2 lines:

unique_emails = sorted(set(emails))

return unique_emails

can be combined into 1 line:

return sorted(set(emails))

Similarly, these lines:

emails = extract_unique_emails("sample.txt")
for email in emails:

can be combined:

for email in extract_unique_emails("sample.txt"):

The regular expression:

email_pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'

can be simplified using \d and the IGNORECASE flag:

email_pattern = r'[a-z\d._%+-]+@[a-z\d.-]+\.[a-z]{2,}'
emails = re.findall(email_pattern, content, re.IGNORECASE)

added 80 characters in body

edited Jun 17 at 16:34

toolic

Documentation

The PEP 8 style guide recommends adding docstrings for functions. For example:

def extract_unique_emails(filename):
    """
    Extract unique email addresses from a text file.
    The input is a path to a test file.
    The function returns a list of email addresses.
    """

Simpler

These 2 lines:

unique_emails = sorted(set(emails))

return unique_emails

can be combined into 1 line:

return sorted(set(emails))

Similarly, these lines:

emails = extract_unique_emails("sample.txt")
for email in emails:

can be combined:

for email in extract_unique_emails("sample.txt"):

The regular expression:

email_pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'

can be simplified using \d and the IGNORECASE flag:

email_pattern = r'[a-z\d._%+-]+@[a-z\d.-]+\.[a-z]{2,}'
emails = re.findall(email_pattern, content, re.IGNORECASE)

answered Jun 17 at 16:21

toolic
