
I was going through the analysis of quicksort in Sedgewick's Algorithms book. He creates the following recurrence relation for number of compares in quicksort while sorting an array of N distinct items.

              1   N
C_N = N + 1 + ─   ∑  (C_{k-1} + C_{N-k}),    C_0 = C_1 = 0
              N  k=1

I am having a tough time understanding this... I know it takes 1/N probability for any element to become the pivot and that if k becomes the pivot, then the left sub-array will have k-1 elements and right sub-array will have N-k elements.

1. How does the cost of partitioning become N + 1? Does it take N + 1 compares to do the partitioning?

2. Sedgewick says that if, for each value of k, you add up the probability that the partitioning element is k times the cost for the two sub-arrays, you get the above equation.

  • Can someone explain this so that someone with less math knowledge (like me) can understand?
  • Specifically how do you get the second term in the equation?
  • What exactly does that term stand for?
    Part of the answer, copied from en.wikipedia.org/wiki/Quicksort "So, averaging over all possible splits and noting that the number of comparisons for the partition is n - 1, the average number of comparisons over all permutations of the input sequence can be estimated accurately by solving the recurrence relation:" For some reason we are off by 2 here - n-1 vs n+1. Commented Aug 5, 2013 at 20:31

3 Answers


The cost function C for quicksort consists of two parts. The first part is the cost of partitioning the array in two 'halves' (the halves don't have to be of equal size, hence the quotes). The second part is the cost of sorting those two halves.

  1. The (N + 1) term is actually a condensed term, and comes from the terms

    (N - 1) + 2
    

    This is the cost of the partitioning in quicksort: N-1 compares with the pivot value, and 2 additional compares due to some boundary conditions in the partitioning.

  2. The second part of the equation consists of the costs for sorting the two 'halves' on either side of the pivot value k.

    After choosing a pivot value, you are left with two unsorted 'halves'. The cost of sorting these 'halves' depends on their size and is most easily described as a recursive application of the cost function C. If the pivot is the smallest of the N values, the costs for sorting the two 'halves' are C(0) and C(N-1) respectively (the cost of sorting an array with 0 elements and the cost of sorting one with N-1 elements). If the pivot is the fifth smallest, the costs are C(4) and C(N-5) (four elements end up to its left and N-5 to its right). And similarly for all other pivot values.

    But how much does it cost to sort those two 'halves' if you don't know the pivot value? This is done by taking the cost for each possible value of the pivot and multiplying that by the chance that that particular value turns up.

    As each pivot value is equally likely, the chance of choosing any particular pivot value is 1/N if you have N elements. To understand this, think about rolling a die. With a fair die, the chance for each side to end up facing up is equal, so the chance of rolling a 1 is 1/6.

    Combined, this gives the summation term where, for each possible value k of the pivot, the cost (C(k-1) + C(N-k)) is multiplied by the chance (1/N); a small numeric sketch of the resulting recurrence follows this list.

  3. The further derivation from the summation formula in the question to the 2N ln N in the title takes too much math to explain here in detail, but it is based on the observation that the cost of sorting an array of N elements, C(N), can be expressed in terms of the cost of sorting an array of N-1 elements, C(N-1), plus a term that is directly proportional to N.
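
To see the recurrence from points 1 and 2 in action, here is a small Python sketch (my own, not from the book) that evaluates it bottom-up with C(0) = C(1) = 0 and prints it next to the 2N ln N figure from the title; the two columns grow at the same rate, with the gap coming from lower-order terms:

    import math

    def expected_compares(n_max):
        """C(N) = (N + 1) + (1/N) * sum over k of (C(k-1) + C(N-k)), with C(0) = C(1) = 0."""
        C = [0.0] * (n_max + 1)
        for N in range(2, n_max + 1):
            # each pivot rank k is equally likely, so every split gets weight 1/N
            C[N] = (N + 1) + sum(C[k - 1] + C[N - k] for k in range(1, N + 1)) / N
        return C

    C = expected_compares(1000)
    for N in (10, 100, 1000):
        print(N, round(C[N], 1), round(2 * N * math.log(N), 1))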

  1. It seems that N+1 as the number of comparisons for the partition step is an error in the book. You need to find out for each of the N–1 non-pivot elements whether it is less than or greater than the pivot, which takes one comparison; thus N–1 comparisons in total, not N+1. (Consider the simplest case, N=2, i.e. one pivot and one other element: There is absolutely no room for doing three comparisons between two elements.)

  2. Consider the case where the chosen pivot happens to be the smallest element (k=1). This means that the array is divided into an empty part to the left (there are no elements that are less than the pivot) and a part to the right that contains all the elements except for the pivot (all other elements are greater than the pivot). This means that the sub-problems that you now want to solve recursively have sizes 0 and N–1 (k–1 and N–k), respectively, and require C(0) and C(N–1) comparisons; thus, C(0)+C(N–1) in total.

    If the pivot happens to be the second smallest element (k=2), the sub-problem sizes are 1 and N–2 (k–1 and N–k; one element on the left, because it is the only one smaller than the pivot). Thus, recursively solving these sub-problems requires C(1)+C(N–2) comparisons. And so on if the pivot is the third smallest element, the fourth, etc. These are the expressions in the numerators.

    Because the pivot is chosen randomly from among the N elements, each case (pivot is smallest, pivot is second smallest, etc.) occurs with equal probability 1/N. That’s where the N in the denominators comes from.
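
To make the N - 1 count concrete, here is a minimal Python sketch (my own; it uses a Lomuto-style partition rather than the two-pointer scheme from Sedgewick's book) in which every non-pivot element is compared to the pivot exactly once, so the count is always N - 1 whatever the input order:

    import random

    def lomuto_partition(a, lo, hi):
        """Partition a[lo..hi] around the pivot a[hi].
        Returns (final pivot index, number of compares against the pivot)."""
        pivot, compares, i = a[hi], 0, lo
        for j in range(lo, hi):          # every non-pivot element...
            compares += 1                # ...is compared to the pivot exactly once
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]        # move the pivot into its final place
        return i, compares

    a = list(range(20))
    random.shuffle(a)
    print(lomuto_partition(a, 0, len(a) - 1)[1])   # prints 19, i.e. N - 1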


N + 1 is not an error. N - 1 compares against the pivot are needed to examine all array elements; one or two additional compares come from the pointers crossing. There are three cases:

  1. Pivot is the largest element:
                                  ⬅ j
                                    ⯆
    ┌───┬───────────────────┬─────┐
    │ p │        < p        │ ≤ p │
    └───┴───────────────────┴─────┘
                               ⯅
                               i
    
    j points past the end of the array. When it's decremented it checks the element pointed to by i one more time. N - 1 + 1 = N compares in total.
  2. Pointers cross after advancing i:
                         ⬅ j
                           ⯆
    ┌───┬─────────┬─────┬─────┬────
    │ p │         │ ≤ p │ ≥ p │
    └───┴─────────┴─────┴─────┴────
                     ⯅
                     i ➡
    
    j's element was previously swapped and therefore already examined by i. When i advances (from < after a simple increment or from ≤ after a swap), it examines j's element once again. Then j is decremented and it checks an element previously pointed to by i. These 2 extra compares bring the total to N - 1 + 2 = N + 1.
  3. Pointers cross after advancing j:
                         ⬅ j
                           ⯆
    ┌───┬─────────┬─────┬─────┬────
    │ p │   ≤ p   │ ≥ p │ ≥ p │
    └───┴─────────┴─────┴─────┴────
                     ⯅
                     i
    
    j is decremented at least once to check i's element. If i stopped at an element strictly greater than the pivot, j is decremented a second time.

All in all the total number of compares is either N or N + 1. If all elements are distinct, N compares happen only when the pivot is the largest element.
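
Here is a Python transliteration of that partitioning scheme with a compare counter (a sketch of my own modeled on the two-pointer partition described above, so treat the details as an approximation of the book's code). Running it over every permutation of 6 distinct keys produces exactly the two totals N and N + 1:

    from itertools import permutations

    def partition(a, lo, hi):
        """Two-pointer partition of a[lo..hi] with a[lo] as the pivot.
        Returns (final pivot position, number of compares against the pivot)."""
        v, i, j, compares = a[lo], lo, hi + 1, 0
        while True:
            while True:                        # scan right while a[i] < v
                i += 1
                compares += 1
                if not a[i] < v or i == hi:
                    break
            while True:                        # scan left while v < a[j]
                j -= 1
                compares += 1
                if not v < a[j] or j == lo:
                    break
            if i >= j:                         # pointers crossed
                break
            a[i], a[j] = a[j], a[i]
        a[lo], a[j] = a[j], a[lo]              # move the pivot into its final place
        return j, compares

    totals = set()
    for perm in permutations(range(6)):        # all 720 orderings of N = 6 distinct keys
        a = list(perm)
        totals.add(partition(a, 0, len(a) - 1)[1])
    print(totals)                              # {6, 7}: always N or N + 1 compares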

The average number of compares when all elements are distinct is

1       N - 1                   1
─ × N + ───── (N + 1) = N + 1 - ─
N         N                     N

The 1 / N correction accounts for the (C_{N - 1}, C_0) split (pivot is the largest element), and in the limit N → ∞ it can be ignored.
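
As a quick sanity check of that identity (my own arithmetic), plugging in a concrete size such as N = 6 gives the same value on both sides:

    N = 6
    lhs = (1 / N) * N + ((N - 1) / N) * (N + 1)   # weighted average of the two outcomes
    rhs = N + 1 - 1 / N
    print(lhs, rhs)                               # both 6.8333...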

Actually 1 / N doesn't affect the analysis at all.

              1   2  N-1
C_N = N + 1 - ─ + ─   ∑  C_i
              N   N  i=0

After multiplying by N

                          N-1
N C_N = N (N + 1) - 1 + 2  ∑  C_i
                          i=0

and differencing

N C_N - (N - 1) C_{N - 1} = 2 N + 2 C_{N - 1}

1 / N goes away.
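
Dividing that last equation by N gives C_N = ((N + 1) / N) C_{N - 1} + 2, which is cheap to iterate. A small sketch (my own, taking C_1 = 0) shows the values creeping toward the 2N ln N growth from the title; the ratio converges slowly because the lower-order terms are still proportional to N:

    import math

    C = 0.0                                    # C_1 = 0: one element needs no compares
    for N in range(2, 1_000_001):
        C = (N + 1) / N * C + 2                # N*C_N = (N + 1)*C_{N-1} + 2*N, divided by N
        if N in (10, 1000, 1_000_000):
            ratio = C / (2 * N * math.log(N))
            print(N, round(C), round(ratio, 3))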
