For a vector $x\in R^p$, its length (Euclidean norm) is
$$ L_x = \sqrt{x^Tx} = ||x||_2 $$
The angle $\theta$ between two vectors $x, y \in R^p$ satisfies
$$ \cos(\theta)=\frac{x^Ty}{L_x L_y} $$
$\cos(\theta)=0 \Rightarrow x^Ty=0$, i.e. $x$ and $y$ are orthogonal.
$x^*=\frac{x}{L_x}$ has length 1 and is in the same direction as $x$.
Two vectors $x_1, x_2$ are orthonormal if $L_{x_1}=L_{x_2}=1$ and $x_1^Tx_2=0$.
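A quick numpy sketch of these definitions, using arbitrary example vectors:

```python
import numpy as np

x = np.array([3.0, 4.0])
y = np.array([-4.0, 3.0])

L_x = np.sqrt(x @ x)                   # length ||x||_2 = 5
cos_theta = (x @ y) / (L_x * np.linalg.norm(y))
print(cos_theta)                       # 0.0 -> x and y are orthogonal

x_star = x / L_x                       # unit vector in the same direction as x
print(np.linalg.norm(x_star))          # 1.0
```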
An inner product $\langle x,y\rangle$ is a binary operation that satisfies symmetry ($\langle x,y\rangle=\langle y,x\rangle$), linearity in the first argument, and positive-definiteness ($\langle x,x\rangle\geq 0$, with equality iff $x=0$).
For real numbers, the inner product is just standard multiplication.
In the Euclidean vector space, the inner product is the dot product.
For all vectors $u$ and $v$ of an inner product space, the Cauchy-Schwarz inequality holds:
$$ |\langle u,v\rangle|^2\leq \langle u,u\rangle \cdot \langle v,v\rangle $$
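A small numpy check of the Cauchy-Schwarz inequality with the Euclidean inner product (the random vectors are for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(5)
v = rng.standard_normal(5)

lhs = (u @ v) ** 2            # |<u, v>|^2 with the Euclidean inner product
rhs = (u @ u) * (v @ v)       # <u, u> * <v, v>
assert lhs <= rhs             # Cauchy-Schwarz
```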
A matrix is a linear transformation. $Ax$ means applying the linear transformation $A$ to the vector $x$. $BAx$ means first apply $A$ to $x$, then apply $B$ to the result.
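A minimal numpy illustration of the composition order (the rotation and stretch matrices are made-up examples):

```python
import numpy as np

A = np.array([[0.0, -1.0], [1.0, 0.0]])   # rotate by 90 degrees
B = np.array([[1.0, 0.0], [0.0, 2.0]])    # stretch the second coordinate by 2
x = np.array([1.0, 0.0])

# BAx: first apply A to x, then apply B to the result
assert np.allclose(B @ (A @ x), (B @ A) @ x)

# Order matters: applying B first, then A, is a different transformation in general
print(B @ A @ x, A @ B @ x)               # [0. 2.] vs [0. 1.]
```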
A square $p\times p$ matrix $Q$ is orthogonal if $QQ^T=Q^TQ=I_p$.
In other words, $Q^T=Q^{-1}$.
Its column (row) vectors are orthonormal.
The inverse of an orthogonal matrix is orthogonal.
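For example, rotation matrices are orthogonal; a quick numpy check of the properties above:

```python
import numpy as np

theta = np.pi / 6
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

assert np.allclose(Q @ Q.T, np.eye(2))
assert np.allclose(Q.T @ Q, np.eye(2))
assert np.allclose(Q.T, np.linalg.inv(Q))            # Q^T = Q^{-1}
assert np.allclose(np.linalg.norm(Q, axis=0), 1.0)   # columns have length 1
```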
An $n\times n$ symmetric matrix $M$ is positive-definite
↔ $x^TMx>0$ for all nonzero $x\in R^n$
↔ All eigenvalues of $M$ are positive
↔ $M=U^T U$ for some invertible matrix $U$
The covariance matrix is always positive semi-definite. It is positive-definite unless one variable (column of the data matrix) is a linear combination of the others.
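A numpy sketch of the equivalences above on a small made-up symmetric matrix (the Cholesky factor plays the role of $U$):

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # symmetric

print(np.linalg.eigvalsh(M))               # [1. 3.] -> all positive, so M is positive-definite

x = np.array([1.0, -4.0])
print(x @ M @ x)                           # 26.0 > 0

U = np.linalg.cholesky(M).T                # Cholesky factor: M = U^T U with U invertible
assert np.allclose(U.T @ U, M)
```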
A matrix $P$ is idempotent if $P^2=P$.
A covariance matrix $A=X^TX$ is positive semi-definite, since $x^TAx=(Xx)^T(Xx)=||Xx||_2^2\geq 0$ for all $x$.
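A quick numpy illustration of both facts, using the least-squares projection matrix $P=X(X^TX)^{-1}X^T$ as an example of an idempotent matrix (the data matrix $X$ is random, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((10, 3))

# The projection onto the column space of X is idempotent: P P = P
P = X @ np.linalg.inv(X.T @ X) @ X.T
assert np.allclose(P @ P, P)

# X^T X is positive semi-definite: all eigenvalues are >= 0 (up to rounding)
assert np.all(np.linalg.eigvalsh(X.T @ X) >= -1e-10)
```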
The rank of a matrix is the dimension of the subspace spanned by its column (or row) vectors.
Column rank = Row rank.
The null space of $A$ is the subspace of all vectors $x$ satisfying $Ax=0$.
Nullity is the dimension of the null space.
$rank(A)+nullity(A)=n$ for a matrix $A$ with $n$ columns (rank-nullity theorem).
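A small numpy example of rank and nullity (the matrix is made up so that its rows are linearly dependent):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],              # = 2 * first row, so the rows are dependent
              [0.0, 1.0, 1.0]])

rank = np.linalg.matrix_rank(A)             # dimension of the column (= row) space
nullity = A.shape[1] - rank                 # rank-nullity theorem
print(rank, nullity)                        # 2 1
```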
The determinant measures how much the linear transformation scales volumes (e.g. for a $2\times 2$ matrix $A$, $|\det(A)|$ is the area of the image of the unit square under the transformation).
For a square $n\times n$ matrix $A$: $rank(A) < n \Rightarrow \det(A)=0 \Rightarrow$ $A$ is not invertible.
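A numpy sketch of both points, using arbitrary example matrices:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
print(np.linalg.det(A))                     # 6.0: the unit square maps to a region of area 6

B = np.array([[1.0, 2.0],
              [2.0, 4.0]])                  # rank 1 < 2
print(np.linalg.det(B))                     # ~0.0 -> B is not invertible
```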
A square matrix $A$ has eigenvector $v\neq 0$ and eigenvalue $\lambda$ if
$$ Av=\lambda v $$
The characteristic polynomial is
$$ p(\lambda)=\det(A-\lambda I_n)=0 \;\Rightarrow\; (\lambda-\lambda_1)^{n_1}(\lambda-\lambda_2)^{n_2}\cdots(\lambda-\lambda_{N_\lambda})^{n_{N_\lambda}}=0 $$
where $N_\lambda$ is the number of unique eigenvalues. $n_1,...,n_{N_\lambda}$ are the algebraic multiplicities of $\lambda_1,...,\lambda_{N_\lambda}$.
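A quick numpy example with a repeated eigenvalue (here the characteristic polynomial is $(\lambda-2)^2$):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 2.0]])                  # characteristic polynomial (lambda - 2)^2

eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)                              # [2. 2.]: one unique eigenvalue, algebraic multiplicity 2

v = eigvecs[:, 0]
assert np.allclose(A @ v, eigvals[0] * v)   # A v = lambda v
```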
If $A$ is a symmetric (real, square) matrix, then all its eigenvalues are real and its eigenvectors can be chosen to be orthonormal, giving the spectral decomposition
$$ A=\sum_{j=1}^p \lambda_j e_j e_j^T=P\Lambda P^T $$
where $P$ is the orthogonal matrix whose columns are the eigenvectors $e_j$ and $\Lambda$ is the diagonal matrix of eigenvalues.
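A numpy check of the spectral decomposition on a small symmetric example:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])                      # symmetric

lam, P = np.linalg.eigh(A)                      # real eigenvalues, orthonormal eigenvectors
assert np.allclose(P @ np.diag(lam) @ P.T, A)   # A = P Lambda P^T

# Equivalently, A is the sum of rank-one terms lambda_j e_j e_j^T
A_rebuilt = sum(lam[j] * np.outer(P[:, j], P[:, j]) for j in range(2))
assert np.allclose(A_rebuilt, A)
```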
For a matrix $X\in R^{n\times m}$, the singular value decomposition (SVD) is
$$ X = U\Sigma V^T $$
where $U\in R^{n\times n}$ and $V\in R^{m\times m}$ are orthogonal and $\Sigma \in R^{n\times m}$ is a diagonal matrix of non-increasing singular values $\sigma_i\geq 0$. The squared singular values are the eigenvalues of the matrix $X^TX$, i.e. $\sigma_i=\sqrt{\lambda_i(X^TX)}$.
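A numpy sketch of the relation between the singular values of $X$ and the eigenvalues of $X^TX$ (the data matrix is random, for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((6, 3))

U, s, Vt = np.linalg.svd(X)                 # singular values s are returned in decreasing order
lam = np.linalg.eigvalsh(X.T @ X)[::-1]     # eigenvalues of X^T X, sorted to match
assert np.allclose(s, np.sqrt(lam))         # sigma_i = sqrt(lambda_i(X^T X))
```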
For a square $p\times p$ matrix $A$, $tr(A)=\sum_{i=1}^p a_{ii}$ is the sum of its diagonal elements. It is also the sum of its eigenvalues counted with algebraic multiplicities:
$$ tr(A)=\sum_{i=1}^p \lambda_i=\sum_{j=1}^{N_\lambda} n_j\lambda_j $$
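A quick numpy check of both expressions for the trace on an arbitrary example:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

print(np.trace(A))                          # 5.0: sum of the diagonal elements
print(np.sum(np.linalg.eigvals(A)).real)    # 5.0: sum of the eigenvalues (with multiplicities)
```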
For a $p\times p$ matrix $A$ and $x=(x_1, ..., x_p)^T$:
If $y=f(A)$ is a scalar function of $A$:
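As an illustration only, assuming the identities intended here include the standard quadratic-form gradient $\partial(x^TAx)/\partial x=(A+A^T)x$, a finite-difference check in numpy:

```python
import numpy as np

# Assumption: the identity being checked, d(x^T A x)/dx = (A + A^T) x, is a standard
# matrix-calculus result; it is used here only as an illustrative example.
rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
x = rng.standard_normal(3)

f = lambda z: z @ A @ z                     # scalar quadratic form
eps = 1e-6
numeric_grad = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps) for e in np.eye(3)])
assert np.allclose(numeric_grad, (A + A.T) @ x, atol=1e-5)
```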