Binary Search on Sorted List with Duplicates

Question

To learn divide-and-conquer algorithms, I am implementing a function in Python called binary_search that will get the index of the first occurrence of a number in a non-empty, sorted list (elements of the list are non-decreasing, positive integers). For example, binary_search([1,1,2,2,3,4,4,5,5], 4) == 5, binary_search([1,1,1,1,1], 1) == 0, and binary_search([1,1,2,2,3], 5) == -1, where -1 means the number cannot be found in the list.

Below is my solution. Although the solution below passed all the tests I created manually it failed test cases from a black box unit tester. Could someone let me know what's wrong with the code below?

def find_first_index(A,low,high,key):
    if A[low] == key:
        return low
    if low == high:
        return -1
    mid = low+(high-low)//2
    if A[mid]==key:
        if A[mid-1]==key:
            return find_first_index(A,low,mid-1,key)  
        else:
            return mid
    if key <A[mid]:            
        return find_first_index(A,low,mid-1,key)    
    else: 
        return find_first_index(A, mid+1, high,key)
    
    
def binary_search(keys, number):
    index = find_first_index(A=keys, low=0,high=len(keys)-1,key=number)  
    return(index)

What does binary_search([], 1) return and what should it return? — mkrieger1
– mkrieger1, Commented Jan 5, 2022 at 2:07
you forget to include the mid position in your divide: find_first_index(A,low,mid-1,key) find_first_index(A,mid+1,high,key) — David
– David, Commented Jan 5, 2022 at 2:53
Hi David, I did take care of the midpoint. In my solution, first I wanted to find the midpoint of the list. Next, I checked if the midpoint is the first occurrence of number. If so, return the current position of the midpoint. Otherwise, if the midpoint were not the first occurrence of number, i.e., there exists some element number that precedes the midpoint, then I apply the same function recursively on the interval [low,midpoint-1]. — confused_programmer
– confused_programmer, Commented Jan 5, 2022 at 3:47
That convoluted way of calculating mid is unnecessary in Python, because Python integers can't overflow. Just use mid = (high+low) // 2. — Mark Ransom
– Mark Ransom, Commented Jan 5, 2022 at 4:57
Binary search is deceptively simple in theory but hideously hard in practice, see Are you one of the 10% of programmers who can write a binary search? — Mark Ransom
– Mark Ransom, Commented Jan 5, 2022 at 5:04

Riccardo Bucco · Accepted Answer · 2022-01-05 11:27:58Z

This should work:

def find_first_index(A, low, high, key):
    if A[low] == key:
        return low
    if low == high:
        return -1
    mid = low + (high - low) // 2
    if A[mid] >= key:
        return find_first_index(A, low, mid, key)
    return find_first_index(A, mid + 1, high, key)


def binary_search(keys, number):   
    return find_first_index(keys, 0, len(keys) - 1, number)

Your solution does not work, as you have already realized. For example, it breaks with the following input:

>>> binary_search([1, 5], 0)
...
RecursionError: maximum recursion depth exceeded in comparison

As you can see, the function does not even terminate, there's an infinite recursion going on here. Try to "run" your program on a piece of paper to understand what's going on (or use a debugger), it's very formative.

So, what's the error? The problem is that starting from some function call high < low. In this specific case, in the first function call low == 0 and high == 1. Then mid = 0 (because int(low + (high - low) / 2) == 0). But then you call find_first_index(A, low, mid - 1, key), which is basically find_first_index(A, 0, -1, key). The subsequent call will be exactly the same (because with low == 0 and high == -1 you will have again mid == 0). Therefor, you have an infinite recursion.

A simple solution in this case would be to have

if low >= high:
    return -1

Or just use my previous solution: checking mid - 1 in my opinion is not a good idea, or at least you must be much more careful when doing that.

Collectives™ on Stack Overflow

Binary Search on Sorted List with Duplicates

1 Answer 1

Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Related