4.2 Strassen's algorithm for matrix multiplication

4.2-1

Step 1, we partition each of matrix A, B into four $\frac{n}{2} \text{ * } \frac{n}{2}$ matrices. So we have:

$$ A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \\ \end{pmatrix}, A_{11} = \begin{pmatrix} 1 \end{pmatrix}, A_{12} = \begin{pmatrix} 3 \end{pmatrix}, A_{21} = \begin{pmatrix} 7 \end{pmatrix}, A_{22} = \begin{pmatrix} 5 \end{pmatrix} $$

$$ B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \\ \end{pmatrix}, B_{11} = \begin{pmatrix} 6 \end{pmatrix}, B_{12} = \begin{pmatrix} 8 \end{pmatrix}, B_{21} = \begin{pmatrix} 4 \end{pmatrix}, B_{22} = \begin{pmatrix} 2 \end{pmatrix} $$

Step 2, we create 10 matrices:

$$S_1 = B_{12} - B_{22} = \begin{pmatrix} 6 \end{pmatrix}$$ $$S_2 = A_{11} + A_{12} = \begin{pmatrix} 4 \end{pmatrix}$$ $$S_3 = A_{21} + A_{22} = \begin{pmatrix} 12 \end{pmatrix}$$ $$S_4 = B_{21} - B_{11} = \begin{pmatrix} -2 \end{pmatrix}$$ $$S_5 = A_{11} + A_{22} = \begin{pmatrix} 6 \end{pmatrix}$$ $$S_6 = B_{11} + B_{22} = \begin{pmatrix} 8 \end{pmatrix}$$ $$S_7 = A_{12} - A_{22} = \begin{pmatrix} -2 \end{pmatrix}$$ $$S_8 = B_{21} + B_{22} = \begin{pmatrix} 6 \end{pmatrix}$$ $$S_9 = A_{11} - A_{21} = \begin{pmatrix} -6 \end{pmatrix}$$ $$S_{10} = B_{11} + B_{12} = \begin{pmatrix} 14 \end{pmatrix}$$

Step 3, we compute the seven matrix products:

$$P_1 = A_{11} * S_{1} = \begin{pmatrix} 6 \end{pmatrix}$$ $$P_2 = S_{2} * B_{22} = \begin{pmatrix} 8 \end{pmatrix}$$ $$P_3 = S_{3} * B_{11} = \begin{pmatrix} 72 \end{pmatrix}$$ $$P_4 = A_{22} * S_{4} = \begin{pmatrix} -10 \end{pmatrix}$$ $$P_5 = S_{5} * S_{6} = \begin{pmatrix} 48 \end{pmatrix}$$ $$P_6 = S_{7} * S_{8} = \begin{pmatrix} -12 \end{pmatrix}$$ $$P_7 = S_{9} * S_{10} = \begin{pmatrix} -84 \end{pmatrix}$$

Step 4, compute the desired submatrices:

$$C_{11} = P_5 + P_4 - P_2 + P_6 = \begin{pmatrix} 18 \end{pmatrix}$$ $$C_{12} = P_1 + P_2 = \begin{pmatrix} 14 \end{pmatrix}$$ $$C_{21} = P_3 + P_4 = \begin{pmatrix} 62 \end{pmatrix}$$ $$C_{22} = P_5 + P_1 - P_3 - P_7 = \begin{pmatrix} 66 \end{pmatrix}$$

So $C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \\ \end{pmatrix} = \begin{pmatrix} 18 & 14 \\ 62 & 66 \\ \end{pmatrix}$.

4.2-2

SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(A, B)

n = A.rows
let C be a new n * n matrix
if n == 1
    C11 = A11 * B11
else partition A, B, and C as in equations (4.9)
    S1 = B12 - B22
    S2 = A11 + A12
    S3 = A21 + A22
    S4 = B21 - B11
    S5 = A11 + A22
    S6 = B11 + B22
    S7 = A12 - A22
    S8 = B21 + B22
    S9 = A11 - A21
    S10 = B11 + B12

    P1 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(A11, S1)
    P2 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(S2, B22)
    P3 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(S3, B11)
    P4 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(A22, S4)
    P5 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(S5, S6)
    P6 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(S7, S8)
    P7 = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(S9, S10)

    C11 = P5 + P4 - P2 + P6
    C12 = P1 + P2
    C21 = P3 + P4
    C22 = P5 + P1 - P3 - P7
return C
def square_matrix_multiply_strassen_algorithm(a, b):
    n = len(a)
    c = [[0] * n for i in range(n)]

    if n == 1:
        c[0][0] = a[0][0] * b[0][0]
    else:
        half = n // 2

        a11 = [[0] * half for i in range(half)]
        a12 = [[0] * half for i in range(half)]
        a21 = [[0] * half for i in range(half)]
        a22 = [[0] * half for i in range(half)]
        b11 = [[0] * half for i in range(half)]
        b12 = [[0] * half for i in range(half)]
        b21 = [[0] * half for i in range(half)]
        b22 = [[0] * half for i in range(half)]

        for i in range(half):
            for j in range(half):
                a11[i][j] = a[i][j]
                a12[i][j] = a[i][j + half]
                a21[i][j] = a[i + half][j]
                a22[i][j] = a[i + half][j + half]
                b11[i][j] = b[i][j]
                b12[i][j] = b[i][j + half]
                b21[i][j] = b[i + half][j]
                b22[i][j] = b[i + half][j + half]

        s1 = subtract(b12, b22)
        s2 = add(a11, a12)
        s3 = add(a21, a22)
        s4 = subtract(b21, b11)
        s5 = add(a11, a22)
        s6 = add(b11, b22)
        s7 = subtract(a12, a22)
        s8 = add(b21, b22)
        s9 = subtract(a11, a21)
        s10 = add(b11, b12)

        p1 = square_matrix_multiply_strassen_algorithm(a11, s1)
        p2 = square_matrix_multiply_strassen_algorithm(s2, b22)
        p3 = square_matrix_multiply_strassen_algorithm(s3, b11)
        p4 = square_matrix_multiply_strassen_algorithm(a22, s4)
        p5 = square_matrix_multiply_strassen_algorithm(s5, s6)
        p6 = square_matrix_multiply_strassen_algorithm(s7, s8)
        p7 = square_matrix_multiply_strassen_algorithm(s9, s10)

        c11 = add(subtract(add(p5, p4), p2), p6)
        c12 = add(p1, p2)
        c21 = add(p3, p4)
        c22 = subtract(subtract(add(p5, p1), p3), p7)

        for i in range(half):
            for j in range(half):
                c[i][j] = c11[i][j]
                c[i][j + half] = c12[i][j]
                c[i + half][j] = c21[i][j]
                c[i + half][j + half] = c22[i][j]

    return c


def add(a, b):
    n = len(a)
    c = [[0] * n for i in range(n)]

    for i in range(n):
        for j in range(n):
            c[i][j] = a[i][j] + b[i][j]

    return c


def subtract(a, b):
    n = len(a)
    c = [[0] * n for i in range(n)]

    for i in range(n):
        for j in range(n):
            c[i][j] = a[i][j] - b[i][j]

    return c

4.2-3

If n is not an exact power of 2, then we can pad 0 to matrix to make it an exact power of 2. Let $k = \lfloor \lg{n} \rfloor$, Let $m = 2^{k + 1}$, so we have a new m * m matrix. Thus, $T(m) = 7T(\frac{m}{2}) + \Theta(m^2)$. According to master method, we have $T(m) = \Theta(m^{\lg7})$. So there exist postive constants $c_1$, $c_2$ and $n_0$ such that $c_1m^{\lg7} \leq T(m) \leq c_2m^{\lg7} \text{ for all } n \geq n_0$.

We have $m^{\lg7} = (2^{\lfloor \lg{n} \rfloor + 1})^{\lg7} \leq (2^{\lg{n} + 1})^{\lg7} = 7n^{\lg7}$. And $m^{\lg7} = (2^{\lfloor \lg{n} \rfloor + 1})^{\lg7} \geq (2^{\lg{n} - 1 + 1})^{\lg7} = n^{\lg7}$. So $min(c_1, 1)n^{\lg7} \leq T(m) \leq max(c_2, 7)n^{\lg7} \text{ for all } n \geq n_0$. Thus $T(m) = \Theta(n^{\lg7})$, the resulting algorithm still runs in time $\Theta(n^{\lg7})$.

4.2-4

We have $T(n) = kT(\frac{n}{3}) + \Theta(n^2)$ with $b = 3, a = k$. If we want to run the algorithm in time $o(n^{\lg7})$, then case 3 cannot apply, since $f(n) = \Omega(n^{\lg7})$.

If case 2 applies, then $f(n) = \Theta(n^{\log_3k})$, so $\log_3k = 2$, but the algorithm runs in time $\Theta(n^2\lg{n})$.

So case 1 applies, the algorithm runs in time $\Theta(n^{\log_3k}) = \Theta(n^{\frac{\lg{k}}{\lg3}})$.

If we want the algorithm to runs in time $o(n^{\lg7})$, then we have $\log_3k < \lg7, k < 3^{\lg7}$, so the largest k is 21.

4.2-5

We know the best running time of multiplying b * b matrices using a multiplications is $\Theta(n^{\log_ba})$. So $\log_68{132464} = 2.7951284873613815$, $\log_70{143640} = 2.795122689748337$, $\log_72{155424} = 2.795147391093449$. So multiplying 72 * 72 matrices using 155,424 multiplications yields the best asymptotic running time.

Since $\lg7 = 2.807354922057604$, so the new algorithm is faster than Strassen's algorithm.

4.2-6

We can partition A, B into k n * n matrices, and partition C into $k^2$ n * n matrices:

$$ A = \begin{pmatrix} A_{11} \\ A_{21} \\ \ldots \\ A_{k1} \end{pmatrix}, B = \begin{pmatrix} B_{11} & B_{12} & \ldots & B_{1k} \end{pmatrix}, C = \begin{pmatrix} C_{11} & C_{12} & \ldots & C_{1k} \\ \ldots \\ C_{k1} & C_{k2} & \ldots & C_{kk} \end{pmatrix} $$

So that we rewrite the equation C = A * B as:

$$ \begin{pmatrix} C_{11} & C_{12} & \ldots & C_{1k} \\ \ldots \\ C_{k1} & C_{k2} & \ldots & C_{kk} \end{pmatrix} = \begin{pmatrix} A_{11} \\ A_{21} \\ \ldots \\ A_{k1} \end{pmatrix} * \begin{pmatrix} B_{11} & B_{12} & \ldots & B_{1k} \end{pmatrix}, C_{ij} = A_{i1} * B_{1j} $$

Then we can use Strassen's algorithm as a subroutine to calculate $C_{ij}$.

let C be a new kn * kn matrix
for i = 1 to k
    for j = 1 to k
        Cij = SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(Ai1 * B1j)
return C

If the input matrices are reversed, we can partition A, B into k n * n matrices:

$$ A = \begin{pmatrix} A_{11} & A_{12} & \ldots & A_{1k} \end{pmatrix}, B = \begin{pmatrix} B_{11} \\ B_{21} \\ \ldots \\ B_{k1} \end{pmatrix} $$

So that we rewrite the equation C = A * B as:

$$ C = \begin{pmatrix} A_{11} & A_{12} & \ldots & A_{1k} \end{pmatrix} * \begin{pmatrix} B_{11} \\ B_{21} \\ \ldots \\ B_{k1} \end{pmatrix} $$

Then we can use Strassen's algorithm as a subroutine to calculate C.

let C be a new n * n matrix
for i = 1 to k
    C11 += SQUARE-MATRIX-MULTIPLY-STRASSEN-ALGORITHM(A1i * Bi1)
return C

4.2-7

MULTIPLY-COMPLEX-NUMBERS(a, b, c, d)

n1 = (a + b) * (c + d)
n2 = a * c
n3 = b * d
return (n2 - n3, n1 - n2 - n3)