An $m \times n$ matrix is a rectangular array of numbers with $m$ rows and $n$ columns:

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
Notation:
"$A$ is an $m \times n$ matrix" means $A$ has $m$ rows and $n$ columns.
We usually denote the entry in row $i$, column $j$ by $a_{ij}$.
We can write a matrix $A$ in terms of its columns:

$$A = \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \cdots & \vec{a}_n \end{bmatrix},$$

where each $\vec{a}_j$ is a column vector in $\mathbb{R}^m$.
Matrix-Vector Multiplication
Definition (Matrix-Vector Multiplication):
If $A$ is an $m \times n$ matrix with columns $\vec{a}_1, \vec{a}_2, \ldots, \vec{a}_n$ and $\vec{x} = (x_1, x_2, \ldots, x_n)$ is a vector in $\mathbb{R}^n$, then:

$$A\vec{x} = \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \cdots & \vec{a}_n \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = x_1\vec{a}_1 + x_2\vec{a}_2 + \cdots + x_n\vec{a}_n$$

The product $A\vec{x}$ is a linear combination of the columns of $A$ with weights from $\vec{x}$, and the resulting vector lies in $\mathbb{R}^m$.
▼ Example 1 Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}$ and $\vec{x} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$.

Compute $A\vec{x}$:

$$A\vec{x} = 2\begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} + 3\begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} = \begin{bmatrix} 2 \\ 6 \\ 10 \end{bmatrix} + \begin{bmatrix} 6 \\ 12 \\ 18 \end{bmatrix} = \begin{bmatrix} 8 \\ 18 \\ 28 \end{bmatrix}$$
▼ Row-by-Row Method for Computation There is an equivalent way to compute $A\vec{x}$ by taking dot products of the rows of $A$ with $\vec{x}$:

$$A\vec{x} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}\begin{bmatrix} 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 1(2) + 2(3) \\ 3(2) + 4(3) \\ 5(2) + 6(3) \end{bmatrix} = \begin{bmatrix} 8 \\ 18 \\ 28 \end{bmatrix}$$
Both methods give the same result. The column perspective (linear combination) is more conceptually important for linear algebra, while the row perspective is sometimes more convenient for computation.
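Both perspectives can be checked numerically. The sketch below (using NumPy, with the values from Example 1) computes the product as a combination of columns and as row-by-row dot products, and compares both against NumPy's built-in `@` operator:

```python
import numpy as np

# Matrix and vector from Example 1
A = np.array([[1, 2],
              [3, 4],
              [5, 6]])
x = np.array([2, 3])

# Column perspective: A x is a linear combination of the columns of A
col_result = x[0] * A[:, 0] + x[1] * A[:, 1]

# Row perspective: entry i is the dot product of row i with x
row_result = np.array([A[i, :] @ x for i in range(A.shape[0])])

print(col_result)   # [ 8 18 28]
print(row_result)   # [ 8 18 28]
print(A @ x)        # NumPy's built-in product agrees: [ 8 18 28]
```

All three computations produce the same vector in $\mathbb{R}^3$.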
▼ Example 2: Identity Matrix Let $I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and $\vec{x} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$.

$$I\vec{x} = a\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + b\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + c\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \vec{x}$$

The identity matrix leaves any vector unchanged: $I\vec{x} = \vec{x}$.
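A quick numerical sketch of the same fact, with arbitrary sample values chosen here for illustration (not from the text):

```python
import numpy as np

# The 3x3 identity matrix leaves any vector unchanged
I = np.eye(3)
x = np.array([7.0, -2.0, 3.5])   # arbitrary sample vector

print(I @ x)   # identical to x
```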
Properties of Matrix-Vector Multiplication
Theorem (Linearity of Matrix-Vector Multiplication):
Matrix-vector multiplication satisfies the following two fundamental linearity properties:

$$A(\vec{u} + \vec{v}) = A\vec{u} + A\vec{v}$$

$$A(c\vec{u}) = c(A\vec{u})$$
▼ Proofs Let $A = \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \cdots & \vec{a}_n \end{bmatrix}$, $\vec{u} = \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix}$, and $\vec{v} = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}$.

Then $\vec{u} + \vec{v} = \begin{bmatrix} u_1 + v_1 \\ \vdots \\ u_n + v_n \end{bmatrix}$ and $c\vec{u} = \begin{bmatrix} cu_1 \\ \vdots \\ cu_n \end{bmatrix}$, so:

$$\begin{align*} A(\vec{u} + \vec{v}) &= (u_1 + v_1)\vec{a}_1 + \cdots + (u_n + v_n)\vec{a}_n \\ &= (u_1\vec{a}_1 + \cdots + u_n\vec{a}_n) + (v_1\vec{a}_1 + \cdots + v_n\vec{a}_n) \\ &= A\vec{u} + A\vec{v} \\[1ex] A(c\vec{u}) &= (cu_1)\vec{a}_1 + \cdots + (cu_n)\vec{a}_n \\ &= c(u_1\vec{a}_1) + \cdots + c(u_n\vec{a}_n) \\ &= c(u_1\vec{a}_1 + \cdots + u_n\vec{a}_n) \\ &= c(A\vec{u}) \end{align*}$$
The two properties combine into a single statement:
Corollary:
For any matrix $A$, vectors $\vec{u}, \vec{v}$, and scalars $c, d$:

$$A(c\vec{u} + d\vec{v}) = cA\vec{u} + dA\vec{v}$$
More generally, for any linear combination, we have:
$$A(c_1\vec{v}_1 + c_2\vec{v}_2 + \cdots + c_k\vec{v}_k) = c_1A\vec{v}_1 + c_2A\vec{v}_2 + \cdots + c_kA\vec{v}_k$$
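Linearity is easy to spot-check numerically. The sketch below draws a random matrix, two random vectors, and two scalars (all sizes arbitrary, chosen for illustration) and verifies that $A(c\vec{u} + d\vec{v})$ matches $cA\vec{u} + dA\vec{v}$ up to floating-point rounding:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random matrix, vectors, and scalars (arbitrary sizes for illustration)
A = rng.standard_normal((3, 4))
u = rng.standard_normal(4)
v = rng.standard_normal(4)
c, d = 2.0, -1.5

# A(cu + dv) should equal c*(Au) + d*(Av) up to rounding error
lhs = A @ (c * u + d * v)
rhs = c * (A @ u) + d * (A @ v)
print(np.allclose(lhs, rhs))   # True
```

The comparison uses `np.allclose` rather than exact equality because the two sides accumulate rounding error in different orders.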