2
\$\begingroup\$

There are seven sorting algorithms in the code copied below.

The first five algorithms have been previously reviewed in this link.

Selection Sort

The Selection Sort algorithm sorts a list by finding the element with minimum value from the right (unsorted part) of the list and putting it at the left (sorted part) of the list.

Bubble Sort

The Bubble Sort algorithm repeatedly swaps the adjacent elements of an input list using two for loops, if they aren't in correct order.

Efficient Bubble Sort

A might-be slightly efficient version of Bubble Sort algorithm is to break the outer loop, when there is no further swapping to be made, in an entire pass. For example, if the list has 10 million elements, it is possible that in the outer for loop, at pass 10,000 for instance, there would be no further swapping required to be made, if the array has been already sorted, thus the rest of the loop would become unnecessary to continue.

Insertion Sort

Insertion Sort algorithm builds the final sorted array in a one element at a time manner. It is less efficient on large lists than more advanced algorithms, such as Quick Sort, Heap Sort or Merge Sort, yet it provides some advantages such as implementation simplicity, efficiency for small datasets, and sorting stability.

Shell Sort

Shell Sort is just a variation of Insertion Sort. In Insertion Sort, when an element has to be moved far ahead, too many movements are involved, which is a drawback. In Shell Sort, we'd make the array "h-sorted" for a large value of h and then keep reducing the value of h (sublist_increment) until it'd become 1. In Shell Sort, selecting odd numbers for "h-sorting" would not be the best idea, since there'd be more overlaps, as compared to even numbers. In the following implementation, sublist_increment was an odd number.

Efficient Shell Sort

In Shell Sort, the selection of h values is important. For instance, [9, 6, 3, 1] are not proper values for h, since 3, 6, and 9 overlap. A list of prime numbers, such as [7, 5, 3, 1], would be much efficient for Shell Sort algorithm.

Merging Two Lists into a New List

In this algorithm, we would first sort two lists using one of the above in-place sorting methods, then we'd create a new list, compare the lists elements, and finally we'd place them into the new list using three simple loops:

  • if both lists have elements to be compared
  • if list 1 has elements left to be placed in the new list
  • if list 2 has elements left to be placed in the new list

I've been trying to implement the above algorithms in Python, just for practicing, and modified them based on prior reviews (as much as I could succeed to do so), I'd appreciate it if you'd review any part of it for any other small or big changes/improvements/recommendations.

import random
from typing import List, TypeVar
from scipy import stats

T = TypeVar('T')


def selection_sort(input_list: List[T]) -> List[T]:
    """
    This method gets an integer/float list and returns 
    an ascendingly sorted integer/float list using Selection Sort Algorithm.

    Attributes:
    - In-Place Sort: Space Complexity O(1)
    - Efficiency: Time Complexity => O(N^2)
    - Unstable Sort: Order of duplicate elements is not preserved

    Iterates through the list and swaps the min value found from the right unsorted side
    of the list with the sorted elements from the left side of the list.
    """

    # Is the length of the list.
    length = len(input_list)

    # Iterates through the list to do the swapping.
    for element_index in range(length - 1):

        min_index = element_index

        # Iterates through the list to find the min index.
        for finder_index in range(element_index + 1, length):
            if input_list[min_index] > input_list[finder_index]:
                min_index = finder_index

        # Swaps the min value with the pointer value.
        if element_index is not min_index:
            _swap_elements(input_list, element_index, min_index)

    return input_list


def bubble_sort(input_list: List[T]) -> List[T]:
    """
    This method gets an integer/float list and returns 
    an ascendingly sorted integer/float list using regular Bubble Sort algorithm.

    Attributes:
    - In-Place Sort: Space Complexity => O(1)
    - Efficiency: Time Complexity => O(N^2)
    - Stable Sort (Order of equal elements does not change)
    """

    length = len(input_list)
    for i in range(length - 1):
        for j in range(length - i - 1):
            if input_list[j] > input_list[j + 1]:
                _swap_elements(input_list, j, j + 1)

    return input_list


def efficient_bubble_sort(input_list: List[T]) -> List[T]:
    """
    This method gets an integer/float list and returns 
    an ascendingly sorted integer/float list 
    using a slightly efficient Bubble Sort algorithm.

    For optimization, the Bubble Sort algorithm stops, if in a pass, 
    there would be no further swaps between an element of the array and the next element.

    Attributes:
    - In-Place Sort: Space Complexity => O(1)
    - Efficiency: Time Complexity => O(N^2)
    - Stable Sort (Order of equal elements does not change)
    """

    # Assigns the length of to be sorted array.
    length = len(input_list)

    for i in range(length - 1):
        number_of_swaps = 0
        for j in range(length - i - 1):
            if input_list[j] > input_list[j + 1]:
                _swap_elements(input_list, j, j + 1)
                number_of_swaps += 1

        # If there is no further swaps in iteration i, the array is already sorted.
        if number_of_swaps == 0:
            break

    return input_list


def _swap_elements(input_list: List[T], index1: int, index2: int) -> None:
    """
    Swaps the adjacent elements of the input list.
    """
    input_list[index1], input_list[index2] = input_list[index2], input_list[index1]


def insertion_sort(input_list: List[T]) -> List[T]:
    """
    This method gets an integer/float list and returns 
    an ascendingly sorted integer/float list using Shell Sort algorithm.

    Attributes:
    - In-Place: Space Complexity O(1)
    - Efficiency (Time Complexity O(N^2) 
    - Good if N is small 
    - It has too many movements
    - Stable Sort (Order of duplicate elements is preserved)
    """

    # Assigns the length of to be sorted array.
    length = len(input_list)

    # Picks the to-be-inserted element from the right side of the array, starting with index 1.
    for i in range(1, length):
        element_for_insertion = input_list[i]

        # Iterates through the left sorted-side of the array to find
        # the correct position for the element to be inserted.
        j = i - 1
        while j >= 0 and input_list[j] > element_for_insertion:
            input_list[j + 1] = input_list[j]
            j -= 1

        # Inserts the element.
        input_list[j + 1] = element_for_insertion

    return input_list


def shell_sort(input_list: List[T], sublist_increment: int = 5) -> List[T]:
    """
    This method gets an integer/float list and returns 
    an ascendingly sorted integer/float list using Insertion Sort algorithm.

    Attributes:
    - In-Place: Space Complexity O(1)
    - Efficiency (Time Complexity O(N*(log N)^2 ) or O(N^1.25)
    - Good if N is large 
    - It reduces the number of movements as compared to Insertion Sort
    - Unstable Sort: Order of duplicate elements is not preserved
    """
    try:
        if sublist_increment // 2 == 0:
            return
    finally:

        # Assigns the length of to be sorted array.
        length = len(input_list)

        while sublist_increment >= 1:

            for i in range(sublist_increment, length):
                element_for_insertion = input_list[i]

                # Iterates through the left sorted-side of the array to find
                # the correct position for the element to be inserted.
                j = i - sublist_increment
                while j >= 0 and input_list[j] > element_for_insertion:
                    input_list[j + sublist_increment] = input_list[j]
                    j -= sublist_increment

                # Inserts the element.
                input_list[j + sublist_increment] = element_for_insertion

            # Narrows down the sublists by two increments.
            sublist_increment -= 2

        return input_list


def efficient_shell_sort(input_list: List[T]) -> List[T]:
    """
    This method gets an integer/float list and returns 
    an ascendingly sorted integer/float list using Insertion Sort algorithm.

    Here, we would use prime numbers, 
    somewhat distributed relative to the length of list to be sorted,
    such that we'd have optimal number of sublists and movements.

    Attributes:
    - In-Place: Space Complexity O(1)
    - Efficiency (Time Complexity O(N*(log N)^2 ) or O(N^1.25) 
    - Good if N is large 
    - It reduces the number of movements as compared to Insertion Sort
    - Unstable Sort: Order of duplicate elements is not preserved
    """

    # Assigns the length of to be sorted array.
    length = len(input_list)

    # Assigns a list of prime numbers larger than three
    # as well as one, in descending order, for sublist increments of Shell Sort.
    sublist_increments = prime_numbers_and_one(length)[::-1]

    for sublist_increment in sublist_increments:

        for i in range(sublist_increment, length):
            element_for_insertion = input_list[i]

            # Iterates through the left sorted-side of the array to find
            # the correct position for the element to be inserted.
            j = i - sublist_increment
            while j >= 0 and input_list[j] > element_for_insertion:
                input_list[j + sublist_increment] = input_list[j]
                j -= sublist_increment

            # Inserts the element.
            input_list[j + sublist_increment] = element_for_insertion

    return input_list


def merge_two_sorted_lists(list1: List[T], list2: List[T]) -> List[T]:
    """
    This method sorts two integer/float lists first, then it'd merge them into a new list.

    Attributes:
    - Initial In-Place Sorting (Space Complexity O(1) = O(1) + O(1))
    - Secondary Not-In-Place Sorting (Space Complexity O(N+M) = O(N) + O(M))
    - Efficiency (Experimental Time Complexity O(N*(log N)^2 ) or O(N^1.25)
    - Good if N is large 
    - It reduces the number of movements as compared to Insertion Sort
    - Stable Sort: Order of duplicate elements would be preserved
    """

    # Sorts both arrays using for instance Optimized Shell Sort.
    efficient_shell_sort(list1)
    efficient_shell_sort(list2)

    # Assigns the lengths of two lists.
    length1, length2 = len(list1), len(list2)

    # Increments for the two lists and the third output list.
    i = j = k = 0

    # Creates a new list with size of lists one and two.
    merged_list = [None] * (length1 + length2)

    # If both lists are have elements to be inserted in the new merged array.
    while i <= length1 - 1 and j <= length2 - 1:
        if list1[i] < list2[j]:
            merged_list[k] = list1[i]
            i += 1
        else:
            merged_list[k] = list2[j]
            j += 1
        k += 1

    # If list one has elements to be inserted in the new merged array,
    # and list two is already done.
    while i <= length1 - 1:
        merged_list[k] = list1[i]
        i += 1
        k += 1

    # If list two has elements to be inserted in the new merged array,
    # and list one is already done.
    while j < length2 - 1:
        merged_list[k] = list1[j]
        j += 1
        k += 1

    return merged_list


def prime_numbers_and_one(array_length: int = 5, prime_numbers=[1]) -> List[T]:
    """
    This method returns a list of prime numbers larger and equal than three
    in addition to one, such as:
    [1, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
    """
    if array_length <= 1:
        return prime_numbers

    number = 3
    while len(prime_numbers) in range(array_length):
        i = 2
        count_divisibles = 0
        for i in range(2, number):
            # If it is not a prime number:
            if number % i == 0:
                count_divisibles += 1
                break
            i += 1

        # If it is a prime number:
        if count_divisibles == 0:
            prime_numbers.append(number)
        number += 1

    return prime_numbers


if __name__ == "__main__":

    # Creates a dash line string and a new line for in between the tests.
    delimiter = "-" * 70 + "\n"

    # Generates a random integer list.
    TEST_LIST_INTEGER = random.sample(range(-100, 100), 15) * 3
    print(f"""The unsorted integer array is:
        {TEST_LIST_INTEGER}""")
    print(delimiter)

    # Generates a random float list.
    TEST_LIST_FLOAT = stats.uniform(0, 100).rvs(45)
    print(f"""The unsorted float array is:
        {TEST_LIST_FLOAT}""")
    print(delimiter)

    # Sample float/integer test list for input.
    INTEGER_FLOAT_INPUT = list(TEST_LIST_INTEGER + TEST_LIST_FLOAT)

    # Sample float/integer test list for output.
    INTEGER_FLOAT_OUTPUT = sorted(INTEGER_FLOAT_INPUT)

    sorting_algorithms = [
        ("Selection Sort", selection_sort),
        ("Bubble Sort", bubble_sort),
        ("Efficient Bubble Sort", efficient_bubble_sort),
        ("Insertion Sort", insertion_sort),
        # Wrap shell_sort into a lambda to make it a single-argument function for testing
        ("Shell Sort", lambda s: shell_sort(s, 5)),
        ("Efficient Shell Sort", efficient_shell_sort)
    ]

    # Testing
    for description, func in sorting_algorithms:
        if (func(INTEGER_FLOAT_INPUT.copy()) == INTEGER_FLOAT_OUTPUT):
            print(f"{description} Test was Successful.")
        else:
            print(f"{description} Test was not Successful.")
        print(f"""{description} (Integer):
            {func(TEST_LIST_INTEGER.copy())}""")
        print(f"""{description} (Float):
            {func(TEST_LIST_FLOAT.copy())}""")
        print(delimiter)

    print(f"""Merging and sorting float and integer lists:\n
        {merge_two_sorted_lists(TEST_LIST_INTEGER, TEST_LIST_FLOAT)}""")

References

\$\endgroup\$
1
  • 2
    \$\begingroup\$ Is this an update on the previous question? If so, I would state so in the question. \$\endgroup\$ Commented Sep 26, 2019 at 17:22

1 Answer 1

2
\$\begingroup\$

A few points:

  • As mentioned in the previous review, the lambda expression in lambda s: shell_sort(s, 5) is no longer needed once the second parameter of shell_sort has a default value, since the function can be called by shell_sort(input_list) just like other functions. Therefore using shell_sort is sufficient.

  • This piece of code is not correctly written.

    def shell_sort(input_list: List[T], sublist_increment: int = 5) -> List[T]:
        try:
            if sublist_increment // 2 == 0:
                return
        finally:
            ...
    

    It should be like this.

    def shell_sort(input_list: List[T], sublist_increment: int = 5) -> List[T]:
        # `//` is floor division so this is the equivalent form.
        # I am not sure whether the logic is actually correct or not.
        # Maybe it should just be `sublist_increment < 2` instead.
        if 0 <= sublist_increment < 2:
            raise ValueError(" ... error message ...")
    
        ... remaining code ...
    
  • As suggested by others in the previous review, the functions modify the input in-place. So it is better not to return a list (simply omit the return statements). And it is called this way:

    list_items = ...
    func(list_items)
    
    ... list_items holds the output so it can be used directly ...
    
  • In small programs, test cases can be better organized as a list or tuple and iterated over during tests, similarly as the tested functions. It makes adding new tests cases (either manually crafted or automatically generated) easier. For larger projects one would need other tools such as pytest.

    GENERATED_INTEGER_TEST = [random.randint(-100, 100) for _ in range(50)]  # `_` represents a don't-care variable
    GENERATED_FLOAT_TEST = [random.uniform(-10, 10) for _ in range(50)]
    
    test_cases = (
        ["Test 1 (Normal)", [10, 45, 20, 30, ....]],
        ["Test 2 (Sorted list)", [10, 20, 30]],
        ["Test 3 (Reverse ordered list)", [0, -10, -24, -33]],
        ["Test 4 (Randomly generated integers)", GENERATED_INTEGER_TEST],
        ....
        ["Test .... (Randomly generated floats)", GENERATED_FLOAT_TEST]
    )
    
    # Add expected output
    for test_case in test_cases:
        test_case.append(sorted(test_case[1]))
    
    ...
    
    # Actual testing
    for func_description, func in sorting_algorithms:
        print("Testing", func_description)
        for test_description, test_input, expected_output in test:
            output = test_input[:]
            func(output)
    
            message = "passed" if output == expected_output else "failed"
            print(test_description, message)
    
            ... print inputs and outputs if needed, using `test_input` and `output` ...
    

    Also note that test cases need to be designed to cover different kinds of inputs that go through different code branches, including edge cases that can possibly lead to bugs. Here, tests on floats would succeed as long as corresponding integer tests succeed. So there is no need to repeat every test for both integer and floats. In other words, as long as the comparision operators are well-defined, the type of inputs is not a feature that can lead to different behaviour of the tested functions. You need to look for other variations instead, as shown in the sample code above.

    As a side remark, the sample code also demonstrates generating random numbers using the random module so scipy is no longer needed.

\$\endgroup\$
1
  • 1
    \$\begingroup\$ @Emma (1) It's fine to keep lambdas for calls with non-default arguments, but I think you could also add one extra test for shell_short with default arguments (so no lambda is needed); (2) if it is intended to fail for even numbers, the condition should be if sublist_increment % 2 == 0; (3) to remove returns, the functions have to be called as shown in my answer (i.e. make a copy of the test input, call function, and compare the in-place modified list with expected result). The sample code is provided exactly because I assumed that you failed to figure out how to correct it. \$\endgroup\$ Commented Sep 28, 2019 at 21:56

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.