Linear Algebra Recap
====================

This is not a chapter where you can learn linear algebra from scratch. It is
meant as a way to refresh your linear algebra knowledge. We will only consider
the canonical finite dimensional vector space of vectors in $\setR^n$.

Vectors
-------

.. sidebar:: **Real Vector Space**

   A *real vector space* is a set $V$ of *vectors* with two operators,
   addition and scalar multiplication (such that $\v x,\v y\in V: \v x+\v y\in
   V$ and $\v x\in V, a\in\setR: a\v x\in V$), obeying the following
   requirements:

   1. Associativity of addition: $\v x + (\v y + \v z) = (\v x + \v y) + \v z$
   2. Commutativity of addition: $\v x + \v y = \v y + \v x$
   3. Identity element of addition (*zero vector*):
      $\exists \v 0 \in V: \forall \v x\in V: \v x + \v 0 = \v x$
   4. Inverse element of addition:
      $\forall \v x\in V: \exists -\v x\in V: \v x + (-\v x) = \v 0$
   5. Distributivity of scalar multiplication:

      * with respect to vector addition: $a(\v x+\v y)=a\v x + a\v y$
      * with respect to scalar addition: $(a+b)\v x = a\v x+b\v x$

   6. Compatibility of scalar multiplication with multiplication of scalars:
      $a(b\v x) = (ab)\v x$
   7. Identity element of scalar multiplication: $1 \v x = \v x$

   In this general setting (and in fact the scalars may come from different
   *fields*, like the complex numbers) vectors need not be $n$-tuples of real
   numbers. We may also define the vector space of all *functions* and define
   the vector addition as $(f+g)(x)=f(x)+g(x)$ (note that the functions $f$
   and $g$ are the vectors here!). The vector space of functions is still a
   *real* vector space: the name refers to the real numbers used as scalars,
   not to the vectors themselves.

A vector $\v x$ in $\setR^n$ can be represented with an $n$-element vector

.. math::
   \v x = \matvec{c}{x_1\\x_2\\ \vdots\\ x_n}

where each element (also called component or coordinate) $x_i$ is a real
value.

To sharpen your intuition for vectors consider a vector $\v x$ in two
dimensions with elements $a$ and $b$. This vector can be interpreted as
either:

.. figure of the point and arrow interpretations goes here

*a point*
   $a$ and $b$ are then the coordinates of a point $(a,b)$ in the plane.

*an arrow*
   $\v x$ denotes the arrow starting in the origin (the point with coordinates
   $(0,0)$) and pointing to the point $(a,b)$.

Both interpretations are often used in practice. Linear *algebra* is called an
*algebra* because it defines ways to do calculations with vectors and studies
the properties of vectors and of these calculations.

The **addition** of two vectors $\v x$ and $\v y$ results in the vector $\v z$
whose elements are the sums of the corresponding elements of the vectors
$\v x$ and $\v y$:

.. math::
   \v z &= \v x + \v y\\
   \matvec{c}{z_1\\z_2\\ \vdots\\ z_n} &= \matvec{c}{x_1\\x_2\\ \vdots\\ x_n}
   + \matvec{c}{y_1\\y_2\\ \vdots\\ y_n}

.. figure of vector addition in 2D goes here

The addition of two vectors is easily interpreted as follows (we do a 2D
example): take the arrow $\v x$ and draw it starting at the origin, next draw
the arrow $\v y$ but start it at the end point of $\v x$. These 'concatenated'
arrows then end at the point $\v z$.

The **scalar multiplication** of a vector $\v x$ with a scalar $\alpha$ (i.e.
a real value) is defined as the elementwise multiplication with $\alpha$:

.. math::
   \alpha \v x = \alpha \matvec{c}{x_1\\x_2\\ \vdots\\ x_n}
   = \matvec{c}{\alpha x_1\\ \alpha x_2\\ \vdots\\ \alpha x_n}
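These two operations translate directly into code. Here is a minimal NumPy
sketch (the array values are ours, chosen only for illustration) that performs
vector addition and scalar multiplication elementwise:

.. code-block:: python

   import numpy as np

   x = np.array([1.0, 2.0, 3.0])
   y = np.array([4.0, 5.0, 6.0])

   # vector addition: the corresponding elements are summed
   z = x + y
   print(z)                 # [5. 7. 9.]

   # scalar multiplication: every element is multiplied by alpha
   alpha = 2.0
   print(alpha * x)         # [2. 4. 6.]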
Consider again the two dimensional example of the vector $\v x$ with elements
$a$ and $b$. The vector $\alpha \v x$, with $\alpha>0$, then is a vector *in
the same direction* as $\v x$ but with a length that is $\alpha$ times the
original length. For $\alpha>1$ the length is increased, for $0<\alpha<1$ the
length is decreased. But what happens when we take a negative $\alpha$?
Consider the case:

.. math::
   -1 \v x = -\v x = -\matvec{c}{a\\b} = \matvec{c}{-a\\-b}

Indeed the direction is reversed: the vector now points in the opposite
direction. Note that **vector subtraction** $\v x - \v y$ simply is
$\v x + (-\v y)$.

The **zero vector** $\v 0$ is the vector of which all elements are zero.
Remember from classical algebra that for any scalar $a$ we have $a+0=a$. For
the zero vector we have that for each vector $\v x$: $\v x + \v 0=\v x$.

Vector addition is **commutative**, i.e. $\v x + \v y = \v y + \v x$, and
**associative**, i.e. $(\v x + \v y) + \v z=\v x + (\v y + \v z)$. Together
these properties imply that in a summation of vectors the order and grouping
of the additions are irrelevant: you may leave out all brackets and change the
order of the vectors.

Basis Vectors
-------------

.. sidebar:: **Basis**

   A vector space $V$ is an $n$-dimensional vector space in case $n$ linearly
   independent vectors $\v b_1,\ldots,\v b_n$ exist such that *any* element of
   $V$ can uniquely be written as a linear combination of these *basis
   vectors*: $\v x = x_1 \v b_1 + \cdots + x_n \v b_n$. We say that the
   vectors $\v b_1,\ldots,\v b_n$ *span* the entire vector space $V$.

   For any $n$-dimensional real vector space each vector in $V$ can thus be
   represented with $n$ real numbers: its *coordinates*. Carefully note that
   the coordinate representation of a vector therefore depends on the basis
   that is, implicitly, used.

   And yes... infinite dimensional vector spaces do exist!

Consider the 2D vector $\v x=\matvec{cc}{3&2}\T$ [#transpose]_. When we draw
it in a standard coordinate frame we mark the point that is 3 standard units
to the right of the origin and 2 units above the origin. Note that the vector
$\v e_1 = \matvec{cc}{1&0}\T$ is pointing one unit in the horizontal direction
and the vector $\v e_2=\matvec{cc}{0&1}\T$ is pointing one unit in the
vertical direction. The vector $\v x$ may thus be written as:

.. math::
   \matvec{c}{3\\2} &= 3\matvec{c}{1\\0} + 2\matvec{c}{0\\1}\\
   &= 3 \v e_1 + 2 \v e_2

In general a vector $\matvec{cc}{a&b}\T$ can be written as $a\v e_1+b\v e_2$.
The vectors $\v e_1$ and $\v e_2$ are said to form a **basis** of $\setR^2$:
any vector in $\setR^2$ can be written as a linear combination of the basis
vectors.

But are the vectors $\v e_1$ and $\v e_2$ unique in that sense? Aren't there
other vectors, say $\v b_1$ and $\v b_2$, with the property that any vector
can be written as a linear combination of these vectors? Yes there are! Any
two vectors $\v b_1$ and $\v b_2$ that do not point in the same (or opposite)
direction can be used as a basis. For instance take
$\v b_1=\matvec{cc}{1&1}\T$ and $\v b_2 = \matvec{cc}{-1&1}\T$, then we have
that:

.. math::
   \matvec{c}{3\\2} = X \matvec{c}{1\\1} + Y \matvec{c}{-1\\1}

with $X=\frac{5}{2}$ and $Y=-\frac{1}{2}$, as can be checked by working out
the vector expression in the right hand side of the above equality (the
coordinates follow from solving $X-Y=3$ and $X+Y=2$).

.. coordinates versus vectors
   from 2D to nD - linear independence
   coordinate transforms
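A quick numerical check of this decomposition, as a NumPy sketch (the variable
names are ours): we verify the linear combination and recover the coordinates
by solving the corresponding $2\times 2$ system with $\v b_1$ and $\v b_2$ as
the columns of a matrix.

.. code-block:: python

   import numpy as np

   b1 = np.array([1.0, 1.0])
   b2 = np.array([-1.0, 1.0])
   x = np.array([3.0, 2.0])

   # the linear combination X*b1 + Y*b2 indeed reproduces x
   X, Y = 2.5, -0.5
   print(np.allclose(X * b1 + Y * b2, x))   # True

   # the coordinates can also be found by solving a 2x2 linear system
   B = np.column_stack((b1, b2))
   print(np.linalg.solve(B, x))             # [ 2.5 -0.5]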
Linear Mappings
---------------

.. sidebar:: **Linear Mappings in Vector Space**

   Note that although this section on linear mappings uses examples with
   $V=\setR^2$, the definitions and results are generic: they apply to any
   finite dimensional vector space. A linear mapping from a finite dimensional
   real vector space $V$ (of dimension $n$) to another finite dimensional real
   vector space $V'$ (of dimension $m$) can be represented with an $m\times n$
   matrix. Again carefully note that the matrix representation depends on the
   bases that are chosen for both $V$ and $V'$.

A linear mapping is a mapping $\op L$ that takes a vector from $\setR^n$ and
maps it onto a vector in $\setR^m$ such that:

.. math::
   &\op L (\v x + \v y) = \op L\v x + \op L\v y\\
   &\op L (\alpha \v x) = \alpha \op L \v x

Consider a mapping $\op L$ that takes vectors from $\setR^2$ and maps these
onto vectors in $\setR^2$ as well. Remember that any vector in $\setR^2$ can
be written as $\v x = x_1 \v e_1 + x_2 \v e_2$, then:

.. math::
   \op L \v x &= \op L (x_1 \v e_1 + x_2 \v e_2)\\
   &= x_1 \op L \v e_1 + x_2 \op L \v e_2

Note that $\op L \v e_1$ is the result of applying the mapping to the vector
$\v e_1$; by our choice of mapping this is a vector in $\setR^2$ as well and
therefore there must be two numbers $L_{11}$ and $L_{21}$ such that:

.. math::
   \op L \v e_1 = L_{11} \v e_1 + L_{21} \v e_2

Equivalently:

.. math::
   \op L \v e_2 = L_{12} \v e_1 + L_{22} \v e_2

So we get:

.. math::
   \op L \v x &= x_1 \op L \v e_1 + x_2 \op L \v e_2\\
   &= x_1 (L_{11} \v e_1 + L_{21} \v e_2) + x_2 (L_{12} \v e_1 + L_{22} \v e_2)\\
   &= (L_{11} x_1 + L_{12} x_2 ) \v e_1 + (L_{21} x_1 + L_{22} x_2) \v e_2

Let $\op L\v x = \matvec{cc}{x'_1&x'_2}\T$ be the coordinates with respect to
the elementary basis, then we have:

.. math::
   \matvec{c}{x'_1\\x'_2} = \matvec{c}{L_{11} x_1 + L_{12} x_2\\L_{21} x_1 + L_{22} x_2}

The four scalars $L_{ij}$ for $i,j=1,2$ thus completely define this linear
mapping. It is customary to write these scalars in a **matrix**:

.. math::
   L = \matvec{cc}{L_{11}&L_{12}\\L_{21}&L_{22}}

For now a matrix is a symbolic notation; we will turn matrices and vectors
into a real algebra in the next subsection when we consider calculations with
matrices and vectors. With the matrix $L$ we can define the matrix-vector
product that describes the relation between the original vector $\v x$ and the
resulting vector $\v x'$:

.. math::
   \matvec{c}{x'_1\\x'_2} = \matvec{cc}{L_{11}&L_{12}\\L_{21}&L_{22}}\matvec{c}{x_1\\x_2}

or equivalently:

.. math::
   \v x' = L \v x

Note that we may identify a linear mapping $\op L$ with its matrix $L$ (as is
often done), but then carefully note that the basis for the input vectors and
the basis for the output vectors are implicitly defined.

Matrices
--------

A matrix $A$ is a rectangular array of numbers. The elements $A_{ij}$ in the
matrix are indexed with the row number $i$ and the column number $j$. The size
of a matrix is denoted as $m\times n$ where $m$ is the number of rows and $n$
is the number of columns.

**Matrix Addition**. Two matrices $A$ and $B$ can only be added together in
case both are of size $m\times n$. Then we have:

.. math::
   C &= A + B\\
   C_{ij} &= A_{ij} + B_{ij}

.. sidebar:: **Matrix Product**

   .. figure:: /images/Matrix_multiplication_diagram_2.png
      :width: 100%

      Figure from Wikipedia.

**Matrix Multiplication**. Two matrices $A$ and $B$ can be multiplied to give
$C=AB$ in case $A$ is of size $m\times n$ and $B$ is of size $n\times p$. The
resulting matrix $C$ is of size $m\times p$. The matrix multiplication is
defined as:

.. math::
   C &= AB\\
   C_{ij} &= \sum_{k=1}^{n} A_{ik} B_{kj}
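The index formula for the matrix product is easy to check numerically. Below
is a short NumPy sketch (with a $2\times 3$ and a $3\times 2$ matrix of our
own choosing) that compares explicit loops with NumPy's ``@`` operator, and
that also illustrates the previous section: the columns of a mapping matrix
are the images of the basis vectors.

.. code-block:: python

   import numpy as np

   A = np.arange(6.0).reshape(2, 3)          # size 2x3
   B = np.arange(6.0).reshape(3, 2)          # size 3x2

   # C_ij = sum_k A_ik B_kj, written out with explicit loops
   C = np.zeros((2, 2))
   for i in range(2):
       for j in range(2):
           C[i, j] = sum(A[i, k] * B[k, j] for k in range(3))

   print(np.allclose(C, A @ B))              # True

   # the j-th column of a mapping matrix is the image of the j-th basis vector
   L = np.array([[2.0, 1.0],
                 [0.0, 3.0]])
   e1 = np.array([1.0, 0.0])
   print(np.allclose(L @ e1, L[:, 0]))       # True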
**Matrix-Vector Multiplication**. We may consider a vector $\v x$ in $\setR^n$
as a matrix of size $n\times 1$. In case $A$ is a matrix of size $m\times n$
the matrix-vector multiplication $A\v x$ results in a vector in
$m$-dimensional space (or a matrix of size $m\times 1$). In a previous section
we have already seen that any linear mapping from an $n$-dimensional vector
space to an $m$-dimensional vector space can be represented with an $m\times
n$ matrix.

**Dot Product**. Consider two vectors $\v x$ and $\v y$, both in $\setR^n$,
then the dot product is defined as the sum of the products of the
corresponding elements of the two vectors:

.. math::
   \v x \cdot \v y = \sum_{i=1}^n x_i y_i

Note that we can also write:

.. math::
   \v x \cdot \v y = \v x\T\v y = \v y\T \v x

**Matrix Transpose**. Consider a matrix $A$ of size $m\times n$, then the
*transpose* $A\T$ is the matrix in which the rows and columns of $A$ are
interchanged:

.. math::
   (A\T)_{ij} = A_{ji}

Note that $A\T$ has size $n\times m$. The transpose is the operator that turns
a column vector into a row vector (and vice versa).

**Symmetric Matrices**. In case $A\T = A$ the matrix $A$ is called symmetric.

**Square Matrices**. A matrix is square if $m=n$.

**Diagonal Matrix**. A square matrix whose elements outside the main diagonal
(the elements $A_{ii}$) are all zero.

**Identity Matrix**. The diagonal matrix with all diagonal elements equal to
one. The identity matrix of size $m\times m$ is denoted as $I_m$ (the size
subscript is omitted in case it is clear from the context). Interpreted as a
linear mapping, the identity matrix corresponds to the identity mapping (i.e.
$I\v x = \v x$ for all $\v x$).

Linear Equations, Determinants and Inverse Matrices
---------------------------------------------------

The matrix-vector equation $A\v x= \v b$, where $A$ is a known matrix of size
$m\times n$, $\v x$ is the unknown column vector of size $n\times 1$ and
$\v b$ is a known vector of size $m\times 1$, defines a set of $m$ equations
in $n$ unknowns:

.. math::
   A_{11} x_1 + A_{12} x_2 + \cdots + A_{1n} x_n &= b_1\\
   A_{21} x_1 + A_{22} x_2 + \cdots + A_{2n} x_n &= b_2\\
   \vdots\\
   A_{m1} x_1 + A_{m2} x_2 + \cdots + A_{mn} x_n &= b_m

A necessary condition for such a set of equations to have a unique solution is
that $m=n$, i.e. that $A$ is a **square matrix**. In case the **determinant**
$|A|$ is not equal to zero the matrix is called **non-singular** and an
**inverse matrix** $A\inv$ exists. For the inverse of a matrix $A$ (if it
exists) we have that $A A\inv = A\inv A = I$. A set of linear equations
represented as $A\v x = \v b$ then has the solution $\v x = A\inv \v b$.

Please note that solving a set of linear equations by explicitly calculating
the inverse matrix is not a wise thing to do numerically. Better ways to solve
such a system are known and available in most linear algebra packages.
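NumPy's ``numpy.linalg`` module provides such routines. The sketch below (a
small system of our own choosing) solves $A\v x=\v b$ directly and uses the
determinant and the inverse only to verify the statements above:

.. code-block:: python

   import numpy as np

   A = np.array([[2.0, 1.0],
                 [1.0, 3.0]])
   b = np.array([3.0, 5.0])

   print(np.linalg.det(A))                    # 5.0 (up to rounding): A is non-singular

   # the preferred way: solve the system directly
   x = np.linalg.solve(A, b)
   print(np.allclose(A @ x, b))               # True

   # the same solution via the explicit inverse, shown only for illustration
   A_inv = np.linalg.inv(A)
   print(np.allclose(x, A_inv @ b))           # True
   print(np.allclose(A @ A_inv, np.eye(2)))   # A A^{-1} = I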
Vector Norm, Inner Products and Orthogonality
---------------------------------------------

Let $\v x$ be a vector in 2D space with coordinate representation
$\v x = \matvec{cc}{x_1&x_2}\T$ with respect to orthogonal basis vectors of
unit length (i.e. an **orthonormal basis**), then we may use Pythagoras'
theorem to calculate the length of the vector, which is denoted as $\|\v x\|$
and is called the **vector norm**:

.. math::
   \|\v x\| = \sqrt{x_1^2+x_2^2}

The **inner product** $\langle \v x, \v y \rangle$ of two vectors $\v x$ and
$\v y$ relates the lengths of the vectors to the angle $\theta$ between the
vectors:

.. math::
   \langle \v x, \v y \rangle = \|\v x\| \|\v y\| \cos\theta

In a vector space where the coordinates are relative to an orthonormal basis
we have that

.. math::
   \|\v x\| = \sqrt{\langle \v x, \v x \rangle}

In case the angle between the two vectors is $90$ degrees (or $\half\pi$
radians), i.e. the vectors are **orthogonal**, we have that $\v x\T\v y=0$.

An $n\times n$ matrix $A$ whose columns form an *orthonormal* basis is called
an **orthogonal matrix**. Interpreted as a linear mapping, an orthogonal
matrix preserves the length (norm) of vectors: $\|A\v x\|=\|\v x\|$. The
**inverse of an orthogonal matrix** equals its transpose: $A\inv = A\T$.

Eigenvectors and Eigenvalues
----------------------------

Consider a linear mapping from $\setR^n$ onto itself. Such a mapping can be
represented with a square $n\times n$ matrix $A$. In case for a specific
vector $\v x$ we have that:

.. math::
   A\v x = \lambda\v x

we say that $\v x$ is an **eigenvector** of $A$ with corresponding
**eigenvalue** $\lambda$.

Let $A$ be a symmetric $n\times n$ matrix, then it can be shown that there are
$n$ eigenvectors that form an orthonormal basis of $\setR^n$. Let $\v u_i$ for
$i=1,\ldots,n$ be the $n$ eigenvectors with corresponding eigenvalues
$\lambda_i$, then we can write:

.. math::
   A = U D U\inv

where $U = \matvec{c}{\v u_1,\ldots,\v u_n}$ is the matrix whose columns are
the $n$ eigenvectors and $D$ is a diagonal matrix whose entries are the
corresponding eigenvalues. Because the eigenvectors form an orthonormal basis,
$U$ is an orthogonal matrix and thus $U\inv = U\T$. The above equation is
called the **eigenvalue decomposition**.

Singular Value Decomposition
----------------------------

Given any matrix $A$, the **singular value decomposition (SVD)** writes the
matrix as the product of an orthogonal matrix $U$, a 'diagonal' matrix $D$ and
a second orthogonal matrix $V$:

.. math::
   \underbrace{A}_{m\times n} = \underbrace{U}_{m\times m} \quad
   \underbrace{D}_{m\times n}\quad \underbrace{V\T}_{n\times n}

Evidently the matrix $D$ cannot be truly diagonal in general as $A$ is not
necessarily a square matrix. It is nevertheless called diagonal because only
the elements $D_{ii}$ can be different from zero.

----

.. [#transpose] The $\v x\T$ denotes the *transpose* of a vector. For now you
   can read it as a way to make a column vector out of a row vector.
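Both decompositions are readily computed with NumPy. As a closing numerical
illustration (with a small symmetric matrix of our own choosing), the sketch
below reconstructs the matrix from its eigenvalue decomposition and from its
singular value decomposition:

.. code-block:: python

   import numpy as np

   A = np.array([[2.0, 1.0],
                 [1.0, 2.0]])                   # symmetric

   # eigenvalue decomposition A = U D U^T (eigh is meant for symmetric matrices)
   lam, U = np.linalg.eigh(A)
   D = np.diag(lam)
   print(np.allclose(A, U @ D @ U.T))           # True
   print(np.allclose(U.T @ U, np.eye(2)))       # the eigenvectors are orthonormal

   # singular value decomposition A = U D V^T
   U2, s, Vt = np.linalg.svd(A)
   print(np.allclose(A, U2 @ np.diag(s) @ Vt))  # True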