2.2 Analyzing algorithms

2.2-1

$\Theta(n^3)$.

2.2-2

Pseudocode:

SELECTION-SORT (A)

1 for i = 1 to A.length - 1
2     min_index = i
3
4     for j = i + 1 to A.length
5         if A[min_index] > A[j]
6             min_index = j
7
8     if i != min_index:
9         temp = A[i]
10        A[i] = A[min_index]
11        A[min_index] = temp

Loop invariant:

At the start of each iteration of the outer for loop, the subarray A[1..i - 1] consists of elements that are in sorted order.

Why does it need to run for only the first n - 1 elements, rather than for all n elements?

When i equals to A.length - 1, only the last two elements may not be in sorted order, and we need to find the A.length - 1 smallest element, so we compare A[A.length - 1] and A[A.length]. We put the smaller element at the index of A.length - 1, and put the larger element at the index of A.length. So after this iteration, the last element of the array is already the largest number. So it's not necessary to run for the nth element.

Give the best-case and worst-case running times of selection sort in $\Theta$ notation.

Let's first see the cost of each statement and the number of times each statement is executed. We let $t_i$ denote the number of times the operation in line 6 is executed for that value of i, and let $p_i$ denote the number of times the operation in line 9/10/11 is executed for that value of i.

Line number Cost Times
1 $c_1$ $n$
2 $c_2$ $n - 1$
4 $c_4$ $\sum_{i = 1}^{n - 1} (n - i + 1) = \frac{n(n + 1)}{2} - 1$
5 $c_5$ $\sum_{i = 1}^{n - 1} (n - i) = \frac{n(n - 1)}{2}$
6 $c_6$ $\sum_{i = 1}^{n - 1} t_i$
8 $c_8$ $n - 1$
9 $c_9$ $\sum_{i = 1}^{n - 1} p_i$
10 $c_{10}$ $\sum_{i = 1}^{n - 1} p_i$
11 $c_{11}$ $\sum_{i = 1}^{n - 1} p_i$

So the running time of selection sort is:

$$T(n) = c_1n + c_2(n - 1) + c_4(\frac{n(n + 1)}{2} - 1) + c_5\frac{n(n - 1)}{2} + c_6\sum_{i = 1}^{n - 1} t_i + c_8(n - 1) + \\\ (c_9 + c_{10} + c_{11})\sum_{i = 1}^{n - 1} p_i$$

Now let's see the best-case. The best-case occurs if the array is already sorted. Thus $t_i$ and $p_i$ are both 0, and the best-case running time is:

$$ \begin{eqnarray} T(n) &=& c_1n + c_2(n - 1) + c_4(\frac{n(n + 1)}{2} - 1) + c_5\frac{n(n - 1)}{2} + c_8(n - 1) \\\ &=& (\frac{c_4}{2} + \frac{c_5}{2})n^2 + (c_1 + c_2 + \frac{c_4}{2} - \frac{c_5}{2} + c_8)n - (c_2 + c_4 + c_8) \\\ &=& \Theta(n^2) \end{eqnarray} $$

And the worst-case occurs if the array is in reverse sorted order. Thus in the iteration of j, line 6 is executed from i + 1 to A.length - (i - 1) (because after the first iteration of i, the largest element is at the index of A.length, after the second iteration of i, the second largest element is at the index of A.length - 1, so the last i - 1 elements are bigger than the previous elements), so $t_i = n - i - (i - 1) = n - 2i + 1$. So the times of line 6 is $c_6\sum_{i = 1}^{n - 1} (n - 2i + 1)$, notice that $n - 2i + 1$ will be 0, when it's 0, it's not necessary to sum it, to make it simple, let's assume it stops when $i = \frac{n}{2}$. So:

$$ \begin{eqnarray} \sum_{i = 1}^{n - 1} (n - 2i + 1) &=& (n - 1) + (n - 3) + \ldots + 1 \\\ &=& n * \frac{n}{2} - (1 + 3 + 5 + \ldots + (2 * \frac{n}{2} - 1)) \\\ &=& \frac{n^2}{2} - \frac{\frac{n}{2}(1 + n - 1)}{2} \\\ &=& \frac{n^2}{4} \end{eqnarray} $$

Now let's see $p_i$, we already assume that line 6 will not be executed when $i \geq \frac{n}{2}$, which also means line 9 to line 11 will not be executed. So $\sum_{i = 1}^{n - 1} p_i$ should be $\frac{n}{2}$. So the running time of worst-case is:

$$ \begin{eqnarray} T(n) &=& c_1n + c_2(n - 1) + c_4(\frac{n(n + 1)}{2} - 1) + c_5\frac{n(n - 1)}{2} + c_6\frac{n^2}{4} + c_8(n - 1) + \frac{c_9 + c_{10} + c_{11}}{2}n \\\ &=& (\frac{c_4}{2} + \frac{c_5}{2} + \frac{c_6}{4})n^2 + (c_1 + c_2 + \frac{c_4}{2} - \frac{c_5}{2} + c_8 + \frac{c_9 + c_{10} + c_{11}}{2})n - (c_2 + c_4 + c_8) \\\ &=& \Theta(n^2) \end{eqnarray} $$

2.2-3

How many elements of the input sequence need to be checked on the average?

Since the element being searched for is equally likely to be any element in the array, so the probability of finding the target value at index i is $1 / n$, and it will check i elements. So the average checks is:

$$\frac{1}{n}(1 + 2 + 3 + \ldots + n - 1 + n) = \frac{1}{n}\frac{n(n + 1)}{2}=\frac{n + 1}{2}$$

And the worst case will check n elements. The running time of average case and worst case are both $\Theta(n)$.

2.2-4

We can check if the input is a best-case first, if it is true, then we can return a predefined solution for that best-case. For example, in a sorting algorithm, if the input is sorted, then we can return the array directly.