Interpret a date from a string of digits

I developed a function that extracts a date from a given sequence of digits and reformats it.
This is the code:

from datetime import datetime as dt


def format_dates(field):
    n = len(field)
    match = False
    i = 0
    while match is False:
        try:
            # Take the last four digits
            year = int(field[-4 - i:n - i])
        except ValueError:
            return ''

        # Check that the year lies in the accepted range (1919 to 2019)
        if (1919 <= year <= 2019):
            # Check if there are other 4 digits before these 4 ones
            if (len(field[-8 - i:n - i]) == 8):
                try:
                    f_date = dt.strptime(field[-8 - i:n - i],
                                         '%d%m%Y').strftime('%d/%m/%Y')
                    match = True
                    return f_date
                except ValueError:
                    pass
            else:
                return ''
        i += 1

Explanation:
This function:

  • Takes a sequence of digits as input.

  • Extracts the last four digits from that sequence.

  • Checks whether those four digits form a year between 1919 and 2019. If not, it moves on to the next iteration (i is incremented, which shifts the window one digit to the left).

  • If so, it checks whether there are four more digits before the ones just extracted. If not, it returns an empty string.

  • If so, it tries to parse all eight digits as a date.

  • If a ValueError is raised, it passes. (A ValueError here means there are eight digits and the last four form a valid year, but the first four digits are not a valid day and month. So i is incremented, which adds the next digit at the front and drops the last digit of the processed sequence.)
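
The steps above can also be written as a single scan over 8-character windows, moving from right to left. The following is only a minimal sketch, not the code under review: the name `find_date` and the `min_year`/`max_year` parameters are hypothetical, and it assumes the input contains only digits (a non-digit in the year position is skipped here, whereas the original function returns an empty string).

```python
from datetime import datetime


def find_date(digits, min_year=1919, max_year=2019):
    """Hypothetical helper: return the rightmost DDMMYYYY date as DD/MM/YYYY."""
    # 'end' is the exclusive end index of each 8-character window,
    # starting at the end of the string and moving left.
    for end in range(len(digits), 7, -1):
        window = digits[end - 8:end]
        try:
            # The last four characters of the window are the candidate year.
            if not min_year <= int(window[4:]) <= max_year:
                continue
            # strptime raises ValueError when day or month is invalid.
            return datetime.strptime(window, '%d%m%Y').strftime('%d/%m/%Y')
        except ValueError:
            continue
    # No window produced a valid date.
    return ''
```

With a single `try` block and a single fallback `return`, the repeated exception handling and the `match` flag are no longer needed in this sketch.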

Example:

input: '1303201946'

  1. Iteration 1:
  • i = 0, match = False
  • year = 1946
  • test 1 (year between 1919 and 2019): passes.
  • test 2 (there are four other digits before 1946, which are 0320): passes.
  • format the whole 8 digits (03201946): ValueError exception, so i = i + 1 and pass to the next iteration.
  2. Iteration 2:
  • i = 1, match = False
  • year = 0194 (that is, 194)
  • test 1 (year between 1919 and 2019): fails, so i = i + 1 and pass to the next iteration.
  3. Iteration 3:
  • i = 2, match = False
  • year = 2019
  • test 1: passes.
  • test 2: passes.
  • format the whole 8 digits (13032019): 13/03/2019 (no ValueError exception), passes.
  • match = True, return the formatted date, which exits the while loop.
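
To make the iterations easier to follow, this small snippet lists the 8-digit window that the slice `field[-8 - i:n - i]` selects for each value of `i` in the example. It is only an illustration of the slicing; the variable name `windows` is introduced here and is not part of the function.

```python
field = '1303201946'
n = len(field)

# The 8-digit window examined in iterations 1, 2 and 3 (i = 0, 1, 2).
windows = [field[-8 - i:n - i] for i in range(3)]

for i, window in enumerate(windows):
    # The last four characters of each window are the candidate year.
    print(i, window, window[-4:])
```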

This function works fine, but the way it handles errors seems ugly. I also believe it is not optimized: the same exception handling is repeated, there are a lot of returns, and the code does not seem elegant.
How can I restructure the code to make it cleaner and more efficient?

singrium