最近开始学习Coursera上普林斯顿的算法分析课程,后续会记录下学习过程,这次是第一周的内容。

课程主页:https://www.coursera.org/learn/analysis-of-algorithms

Exercise

1.14

Follow through the steps given for quicksort to solve the recurrence:

解:两边乘以$N$可得

相减可得

两边同除$N(N+1)$得到

所以

1.15

Show that the average number of exchanges used during the first partitioning stage (before the pointers cross) is $(N-2)/6$.

首先回顾源代码:

public class Quick
{
    private static int partition(Comparable[] a, int lo, int hi)
    {
        int i = lo, j = hi+1;
        while (true)
        {
            while (less(a[++i], a[lo])) if (i == hi) break;
            while (less(a[lo], a[--j])) if (j == lo) break;
            if (i >= j) break;
            exch(a, i, j);
    	}
        exch(a, lo, j);
        return j;
    }
    private static void sort(Comparable[] a, int lo, int hi)
    {
        if (hi <= lo) return;
        int j = partition(a, lo, hi);
        sort(a, lo, j-1);
        sort(a, j+1, hi);
    }
}

题目的意思是计算exch(a, lo, j)之前的平均交换次数。

不妨设元素为$1,2,\ldots, N$,假设主元为$k$,那么交换次数为$a[1],\ldots , a[k-1]$中大于$k$的元素数量,考虑示性函数$X_i$,其中

那么在主元为$k$的条件下,平均交换次数为

因为$k$可以可以等概率的取$1,2,\ldots, N$中任意的数,所以平均交换次数为

1.17

If we change the first line in the quicksort implementation given in the lecture to call insertion sort when the subfile size is not greater than $M$ then the total number of comparisons to sort $N$ elements is described by the recurrence

Solve this recurrence.

对于$N > M$,递推式仍然为

递推可得

所以对于$N > M$

1.18

所以

不考虑常数项,作图可得

当M= 8 时取最小值

代码如下:

import numpy as np
import matplotlib.pyplot as plt

M = np.arange(1, 100)
d1 = M * (M - 1) / (4 * (M + 1))
d2 = 2 * np.cumsum(1 / M)
f = d1 - d2

plt.plot(M, f)
plt.show()

print("当M=", M[np.argmin(f)], "时取最小值")

Problem

1

相减可得

累加可得

2

Which of the following is a drawback to analyzing algorithms by proving O- upper bounds on the worst case?

  • Generally cannot be used to predict performance.
  • Likely to be too much detail in the analysis.
  • Generally cannot be used to compare algorithms.
  • Input model may be unrealistic.

显然答案如下:

Generally cannot be used to predict performance.

Generally cannot be used to compare algorithms.