4.2 Strassen's algorithm for matrix multiplication
4.2-1
Step 1, we partition each of matrix A, B into four $\frac{n}{2} \text{ * } \frac{n}{2}$ matrices. So we have:
$$ A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \\ \end{pmatrix}, A_{11} = \begin{pmatrix} 1 \end{pmatrix}, A_{12} = \begin{pmatrix} 3 \end{pmatrix}, A_{21} = \begin{pmatrix} 7 \end{pmatrix}, A_{22} = \begin{pmatrix} 5 \end{pmatrix} $$
$$ B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \\ \end{pmatrix}, B_{11} = \begin{pmatrix} 6 \end{pmatrix}, B_{12} = \begin{pmatrix} 8 \end{pmatrix}, B_{21} = \begin{pmatrix} 4 \end{pmatrix}, B_{22} = \begin{pmatrix} 2 \end{pmatrix} $$
Step 2, we create 10 matrices:
$$S_1 = B_{12} - B_{22} = \begin{pmatrix} 6 \end{pmatrix}$$ $$S_2 = A_{11} + A_{12} = \begin{pmatrix} 4 \end{pmatrix}$$ $$S_3 = A_{21} + A_{22} = \begin{pmatrix} 12 \end{pmatrix}$$ $$S_4 = B_{21} - B_{11} = \begin{pmatrix} -2 \end{pmatrix}$$ $$S_5 = A_{11} + A_{22} = \begin{pmatrix} 6 \end{pmatrix}$$ $$S_6 = B_{11} + B_{22} = \begin{pmatrix} 8 \end{pmatrix}$$ $$S_7 = A_{12} - A_{22} = \begin{pmatrix} -2 \end{pmatrix}$$ $$S_8 = B_{21} + B_{22} = \begin{pmatrix} 6 \end{pmatrix}$$ $$S_9 = A_{11} - A_{21} = \begin{pmatrix} -6 \end{pmatrix}$$ $$S_{10} = B_{11} + B_{12} = \begin{pmatrix} 14 \end{pmatrix}$$
Step 3, we compute the seven matrix products:
$$P_1 = A_{11} * S_{1} = \begin{pmatrix} 6 \end{pmatrix}$$ $$P_2 = S_{2} * B_{22} = \begin{pmatrix} 8 \end{pmatrix}$$ $$P_3 = S_{3} * B_{11} = \begin{pmatrix} 72 \end{pmatrix}$$ $$P_4 = A_{22} * S_{4} = \begin{pmatrix} -10 \end{pmatrix}$$ $$P_5 = S_{5} * S_{6} = \begin{pmatrix} 48 \end{pmatrix}$$ $$P_6 = S_{7} * S_{8} = \begin{pmatrix} -12 \end{pmatrix}$$ $$P_7 = S_{9} * S_{10} = \begin{pmatrix} -84 \end{pmatrix}$$
Step 4, compute the desired submatrices:
$$C_{11} = P_5 + P_4 - P_2 + P_6 = \begin{pmatrix} 18 \end{pmatrix}$$ $$C_{12} = P_1 + P_2 = \begin{pmatrix} 14 \end{pmatrix}$$ $$C_{21} = P_3 + P_4 = \begin{pmatrix} 62 \end{pmatrix}$$ $$C_{22} = P_5 + P_1 - P_3 - P_7 = \begin{pmatrix} 66 \end{pmatrix}$$
So $C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \\ \end{pmatrix} = \begin{pmatrix} 18 & 14 \\ 62 & 66 \\ \end{pmatrix}$.
4.2-2
SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(A, B)
n = A.rows
let C be a new n * n matrix
if n == 1
C11 = A11 * B11
else partition A, B, and C as in equations (4.9)
S1 = B12 - B22
S2 = A11 + A12
S3 = A21 + A22
S4 = B21 - B11
S5 = A11 + A22
S6 = B11 + B22
S7 = A12 - A22
S8 = B21 + B22
S9 = A11 - A21
S10 = B11 + B12
P1 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(A11, S1)
P2 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(S2, B22)
P3 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(S3, B11)
P4 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(A22, S4)
P5 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(S5, S6)
P6 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(S7, S8)
P7 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(S9, S10)
C11 = P5 + P4 - P2 + P6
C12 = P1 + P2
C21 = P3 + P4
C22 = P5 + P1 - P3 - P7
return C
def square_matrix_multiply_strassen_algorithm(a, b):
n = len(a)
c = [[0] * n for i in range(n)]
if n == 1:
c[0][0] = a[0][0] * b[0][0]
else:
half = n // 2
a11 = [[0] * half for i in range(half)]
a12 = [[0] * half for i in range(half)]
a21 = [[0] * half for i in range(half)]
a22 = [[0] * half for i in range(half)]
b11 = [[0] * half for i in range(half)]
b12 = [[0] * half for i in range(half)]
b21 = [[0] * half for i in range(half)]
b22 = [[0] * half for i in range(half)]
for i in range(half):
for j in range(half):
a11[i][j] = a[i][j]
a12[i][j] = a[i][j + half]
a21[i][j] = a[i + half][j]
a22[i][j] = a[i + half][j + half]
b11[i][j] = b[i][j]
b12[i][j] = b[i][j + half]
b21[i][j] = b[i + half][j]
b22[i][j] = b[i + half][j + half]
s1 = subtract(b12, b22)
s2 = add(a11, a12)
s3 = add(a21, a22)
s4 = subtract(b21, b11)
s5 = add(a11, a22)
s6 = add(b11, b22)
s7 = subtract(a12, a22)
s8 = add(b21, b22)
s9 = subtract(a11, a21)
s10 = add(b11, b12)
p1 = square_matrix_multiply_strassen_algorithm(a11, s1)
p2 = square_matrix_multiply_strassen_algorithm(s2, b22)
p3 = square_matrix_multiply_strassen_algorithm(s3, b11)
p4 = square_matrix_multiply_strassen_algorithm(a22, s4)
p5 = square_matrix_multiply_strassen_algorithm(s5, s6)
p6 = square_matrix_multiply_strassen_algorithm(s7, s8)
p7 = square_matrix_multiply_strassen_algorithm(s9, s10)
c11 = add(subtract(add(p5, p4), p2), p6)
c12 = add(p1, p2)
c21 = add(p3, p4)
c22 = subtract(subtract(add(p5, p1), p3), p7)
for i in range(half):
for j in range(half):
c[i][j] = c11[i][j]
c[i][j + half] = c12[i][j]
c[i + half][j] = c21[i][j]
c[i + half][j + half] = c22[i][j]
return c
def add(a, b):
n = len(a)
c = [[0] * n for i in range(n)]
for i in range(n):
for j in range(n):
c[i][j] = a[i][j] + b[i][j]
return c
def subtract(a, b):
n = len(a)
c = [[0] * n for i in range(n)]
for i in range(n):
for j in range(n):
c[i][j] = a[i][j] - b[i][j]
return c
4.2-3
If n is not an exact power of 2, then we can pad 0 to matrix to make it an exact power of 2. Let $k = \lfloor \lg{n} \rfloor$, Let $m = 2^{k + 1}$, so we have a new m * m matrix. Thus, $T(m) = 7T(\frac{m}{2}) + \Theta(m^2)$. According to master method, we have $T(m) = \Theta(m^{\lg7})$. So there exist postive constants $c_1$, $c_2$ and $n_0$ such that $c_1m^{\lg7} \leq T(m) \leq c_2m^{\lg7} \text{ for all } n \geq n_0$.
We have $m^{\lg7} = (2^{\lfloor \lg{n} \rfloor + 1})^{\lg7} \leq (2^{\lg{n} + 1})^{\lg7} = 7n^{\lg7}$. And $m^{\lg7} = (2^{\lfloor \lg{n} \rfloor + 1})^{\lg7} \geq (2^{\lg{n} - 1 + 1})^{\lg7} = n^{\lg7}$. So $min(c_1, 1)n^{\lg7} \leq T(m) \leq max(c_2, 7)n^{\lg7} \text{ for all } n \geq n_0$. Thus $T(m) = \Theta(n^{\lg7})$, the resulting algorithm still runs in time $\Theta(n^{\lg7})$.
4.2-4
We have $T(n) = kT(\frac{n}{3}) + \Theta(n^2)$ with $b = 3, a = k$. If we want to run the algorithm in time $o(n^{\lg7})$, then case 3 cannot apply, since $f(n) = \Omega(n^{\lg7})$.
If case 2 applies, then $f(n) = \Theta(n^{\log_3k})$, so $\log_3k = 2$, but the algorithm runs in time $\Theta(n^2\lg{n})$.
So case 1 applies, the algorithm runs in time $\Theta(n^{\log_3k}) = \Theta(n^{\frac{\lg{k}}{\lg3}})$.
If we want the algorithm to runs in time $o(n^{\lg7})$, then we have $\log_3k < \lg7, k < 3^{\lg7}$, so the largest k is 21.
4.2-5
We know the best running time of multiplying b * b matrices using a multiplications is $\Theta(n^{\log_ba})$. So $\log_68{132464} = 2.7951284873613815$, $\log_70{143640} = 2.795122689748337$, $\log_72{155424} = 2.795147391093449$. So multiplying 72 * 72 matrices using 155,424 multiplications yields the best asymptotic running time.
Since $\lg7 = 2.807354922057604$, so the new algorithm is faster than Strassen's algorithm.
4.2-6
We can partition A, B into k n * n matrices, and partition C into $k^2$ n * n matrices:
$$ A = \begin{pmatrix} A_{11} \\ A_{21} \\ \ldots \\ A_{k1} \end{pmatrix}, B = \begin{pmatrix} B_{11} & B_{12} & \ldots & B_{1k} \end{pmatrix}, C = \begin{pmatrix} C_{11} & C_{12} & \ldots & C_{1k} \\ \ldots \\ C_{k1} & C_{k2} & \ldots & C_{kk} \end{pmatrix} $$
So that we rewrite the equation C = A * B as:
$$ \begin{pmatrix} C_{11} & C_{12} & \ldots & C_{1k} \\ \ldots \\ C_{k1} & C_{k2} & \ldots & C_{kk} \end{pmatrix} = \begin{pmatrix} A_{11} \\ A_{21} \\ \ldots \\ A_{k1} \end{pmatrix} * \begin{pmatrix} B_{11} & B_{12} & \ldots & B_{1k} \end{pmatrix}, C_{ij} = A_{i1} * B_{1j} $$
Then we can use Strassen's algorithm as a subroutine to calculate $C_{ij}$.
let C be a new kn * kn matrix
for i = 1 to k
for j = 1 to k
Cij = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(Ai1 * B1j)
return C
If the input matrices are reversed, we can partition A, B into k n * n matrices:
$$ A = \begin{pmatrix} A_{11} & A_{12} & \ldots & A_{1k} \end{pmatrix}, B = \begin{pmatrix} B_{11} \\ B_{21} \\ \ldots \\ B_{k1} \end{pmatrix} $$
So that we rewrite the equation C = A * B as:
$$ C = \begin{pmatrix} A_{11} & A_{12} & \ldots & A_{1k} \end{pmatrix} * \begin{pmatrix} B_{11} \\ B_{21} \\ \ldots \\ B_{k1} \end{pmatrix} $$
Then we can use Strassen's algorithm as a subroutine to calculate C.
let C be a new n * n matrix
for i = 1 to k
C11 += SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(A1i * Bi1)
return C
4.2-7
MULTIPLY-COMPLEX-NUMBERS(a, b, c, d)
n1 = (a + b) * (c + d)
n2 = a * c
n3 = b * d
return (n2 - n3, n1 - n2 - n3)