Recitation 5
Recitation notes for the week of September 25, 2017.
Overview
This week we will be reviewing the support page on matrix-vector multiplication. Indeed, much of the text in these notes are from that support page. In particular, we will go over:
- Definition of matrices and vectors.
- Example of a matrix-vector multiplication (time $O(n^2)$).
- Inner product
- Structured matrices
- Distributive law
What is a Matrix?
In mathematics, a vector is simply an ordered list of values. A matrix represents a vector of vectors, or a 2-d vector. From the support page:
Matrices
A matrix $\mathbf{A}$ with $m$ rows and $n$ columns (also denoted as an $m\times n$ matrix) will in code be defined as int [][] A = new int[m][n]
(assuming the matrix stores integers). Also a vector $\mathbf{x}$ of size $n$ in code will be declared as int [] x = new int[n]
(again assuming the vector contains integers).
Note
For the programming question in HW 3, you are dealing with square matrices, i.e. $n=m$.
Matrix-Vector Multiplication
Given an $n\times n$ matrix $\mathbf{A}$ and a vector $\mathbf{x}$ of length $n$, their product is denoted by \[\mathbf{y} =\mathbf{A}\cdot \mathbf{x},\] where $\mathbf{y}$ is also a vector of length $n$ and its $i$th entry for $0\le i\lt n$ is defined as follows: \[y_i =\sum_{j=0}^{n-1} A[i][j]\cdot x[j].\]
The Inner Product
It turns out that to visualize the matrix-vector multiplication, it is helpful to define a simpler operation between two vectors:
Inner Product
Given two vectors $\mathbf{x}$ and $\mathbf{y}$, both of length $n$, their inner product is defined as1 \[\left\langle \mathbf{x},\mathbf{y}\right\rangle = \sum_{i=0}^{n-1} x[i]\cdot y[i].\]
Armed with the definition of the inner product, here is an equivalent definition of the matrix-vector multiplication problem:
Matrix-vector multiplication (redefined)
Given an $n\times n$ matrix $\mathbf{A}$ and a vector $\mathbf{x}$ of length $n$, their product $\mathbf{y}=\mathbf{A}\cdot\mathbf{x}$ can also be defined as follows. Its $i$th entry for $0\le i\lt n$ is defined as follows: \[y_i =\left\langle \mathbf{A}[i],\mathbf{x}\right\rangle,\] where $\mathbf{A}[i]$ denotes the $i$th row of $\mathbf{A}$.
Below is one example of a matrix vector multiplication: \[ \begin{pmatrix} 1 & 2 & -3\\ 2 & 9 &0\\ 6 & -1 & -2 \end{pmatrix} \times \begin{pmatrix} 2\\ 3\\ -1 \end{pmatrix} = \begin{pmatrix} 1\times 2 + 2\times 3 + (-3)\times(-1)\\ 2\times 2+ 9\times 3+0\times (-1)\\ 6\times 2+(-1)\times 3+ (-2)\times(-1) \end{pmatrix} = \begin{pmatrix} 11\\ 31\\ 11 \end{pmatrix}. \]
Time Complexity
The time complexity of a regular multiplication is $O(n\cdot m)$, or $O(n^2)$ for square matrices. You can think of it as being two nested for loops. One for each row, and the inner one for each column. Here is a pseudocode example of multiplying matrix $\mathbf{A}$ with vector $\mathbf{x}$:
Algorithm Details
Allocate an array of length n and call it y
#Initialize the array
for i= 0 .. n-1
y[i]=0
#Do the actual matrix-vector multiplication
for i = 0 .. n-1
for j= 0 .. n-1
y[i] += A[i][j] * x[j]
return y
In-recitation Exercise
Run the above algorithm on the matrix-vector multiplication example from last section. Also argue this is an $O(n^2)$ algorithm.
Structured Matrices
If we know what the structure of a matrix is, we can often cut down on our algorithm time. We will consider one such example below and all Questions on HW 3 are based on this theme.
Outer Product
Outer Product
Given two vectors $\mathbf{s}$ and $\mathbf{t}$, their outer product is defined as the $n\times n$ matrix $\mathbf{A}$ such that for any entry $(i,j)$, we have \[A[i][j]=s[i]\cdot t[j].\]
In-recitation Exercise
Show the computation of the outer product of $\mathbf{s}=[4, 3, 2]$ and $\mathbf{t}=[-3, 2, 1]$.
Next, we present an algorithm to multiply the outer product of two arbitrary vectors $\mathbf{s}$ and $\mathbf{t}$ with a vector $\mathbf{x}$ in time $O(n)$. (Note that this is a significant improvement over the runtime from the trivial algorithm above.) In other words, we want to solve the following problem
Multiplying outer product with a vector
Given three vectors $\mathbf{s},\mathbf{t}$ and $\mathbf{x}$, compute the product of the outer product of $\mathbf{s}$ and $\mathbf{t}$ with the vector $\mathbf{x}$. If $\mathbf{y}$ is the product, then it is defined as (for every $0\le i\lt n$): \[y_i=\sum_{j=0}^{n-1} s[i]\cdot t[j]\cdot x[j].\]
Algorithm Idea
Observe that \[y_i=\sum_{j=0}^{n-1} s[i]\cdot t[j]\cdot x[j]=s[i]\cdot \sum_{j=0}^{n-1} t[j]\cdot x[j] = s[i]\left\langle \mathbf{t},\mathbf{x}\right\rangle.\] In other words, once we have computed $\left\langle \mathbf{t},\mathbf{x}\right\rangle$, one can compute $y_i$ by just multiplying the inner product with $s[i]$, which takes $O(1)$ time for each $0\le i\lt n$.
In-recitation Exercise
Do a run of the above algorithm on $\mathbf{s}=[4, 3, 2]$, $\mathbf{t}=[-3, 2, 1]$ and $\mathbf{x}=[3, 2, -2]$.
In-recitation Exercise
Compare the answer from the previous exercise to the answer you get when using our original algorithm with the outer product matrix.
In-recitation Exercise
The runtime for this algorithm is $\Theta(n)$? Why? At least go over the $O(n)$ argument.
The Distributive Law
In this section, we will present a simple observation that turns out to be very useful in designing efficient algorithms (especially those involving addition and multiplication of numbers).
We begin with the statement of the distributive law:
Distributive Law
Given three numbers $a,b,c$ we have4 \[a\cdot b+a\cdot c = a\cdot(b+c).\]
Note that the above implies that \[\sum_{i=0}^{n-1} a\cdot b_i = a\cdot\left(\sum_{i=0}^{n-1} b_i\right).\] With our above outer product example, this allowed us to store values and not have to recalculate them each time.
Hint
This will be very useful for Question 3 on HW 3.
Further Reading
- Make sure you carefully read the policy documents: Syllabus and Homework policies.
- Support page on matrix-vector multiplication.