Home » Math basics » Linear algebra » What are eigenvectors and eigenvalues?

What are eigenvectors and eigenvalues?

Contents

1 Introduction
2 Calculating the eigenvalues
3 Calculating the first eigenvector
4 Calculating the second eigenvector
5 Conclusion

Introduction

Eigenvectors and eigenvalues have many important applications in computer vision and machine learning in general. Well known examples are PCA (Principal Component Analysis) for dimensionality reduction or EigenFaces for face recognition. An interesting use of eigenvectors and eigenvalues is also illustrated in my post about error ellipses. Furthermore, eigendecomposition forms the base of the geometric interpretation of covariance matrices, discussed in an more recent post. In this article, I will provide a gentle introduction into this mathematical concept, and will show how to manually obtain the eigendecomposition of a 2D square matrix.

An eigenvector is a vector whose direction remains unchanged when a linear transformation is applied to it. Consider the image below in which three vectors are shown. The green square is only drawn to illustrate the linear transformation that is applied to each of these three vectors.

Eigenvectors (red) do not change direction when a linear transformation (e.g. scaling) is applied to them. Other vectors (yellow) do.

The transformation in this case is a simple scaling with factor 2 in the horizontal direction and factor 0.5 in the vertical direction, such that the transformation matrix $A$ is defined as:

$A=\begin{bmatrix} 2 & 0 \\ 0 & 0.5 \end{bmatrix}$ .

A vector $\vec{v}=(x,y)$ is then scaled by applying this transformation as $\vec{v}\prime = A\vec{v}$ . The above figure shows that the direction of some vectors (shown in red) is not affected by this linear transformation. These vectors are called eigenvectors of the transformation, and uniquely define the square matrix $A$ . This unique, deterministic relation is exactly the reason that those vectors are called ‘eigenvectors’ (Eigen means ‘specific’ in German).

In general, the eigenvector $\vec{v}$ of a matrix $A$ is the vector for which the following holds:

(1) $\begin{equation*} A \vec{v} = \lambda \vec{v} \end{equation*}$

where $\lambda$ is a scalar value called the ‘eigenvalue’. This means that the linear transformation $A$ on vector $\vec{v}$ is completely defined by $\lambda$ .

We can rewrite equation (1) as follows:

(2) $\begin{eqnarray*} A \vec{v} - \lambda \vec{v} = 0 \\ \Rightarrow \vec{v} (A - \lambda I) = 0, \end{eqnarray*}$

where $I$ is the identity matrix of the same dimensions as $A$ .

However, assuming that $\vec{v}$ is not the null-vector, equation (2) can only be defined if $(A - \lambda I)$ is not invertible. If a square matrix is not invertible, that means that its determinant must equal zero. Therefore, to find the eigenvectors of $A$ , we simply have to solve the following equation:

(3) $\begin{equation*} Det(A - \lambda I) = 0. \end{equation*}$

In the following sections we will determine the eigenvectors and eigenvalues of a matrix $A$ , by solving equation (3). Matrix $A$ in this example, is defined by:

(4) $\begin{equation*} A = \begin{bmatrix} 2 & 3 \\ 2 & 1 \end{bmatrix}. \end{equation*}$

Calculating the eigenvalues

To determine the eigenvalues for this example, we substitute $A$ in equation (3) by equation (4) and obtain:

(5) $\begin{equation*} Det\begin{pmatrix}2-\lambda&3\\2&1-\lambda\end{pmatrix}=0. \end{equation*}$

Calculating the determinant gives:

(6) $\begin{align*} &(2-\lambda)(1-\lambda) - 6 = 0\\ \Rightarrow &2 - 2 \lambda - \lambda - \lambda^2 -6 = 0\\ \Rightarrow &{\lambda}^2 - 3 \lambda -4 = 0. \end{align*}$

To solve this quadratic equation in $\lambda$ , we find the discriminant:

$\begin{equation*} D = b^2 -4ac = (-3)^2 -4*1*(-4) = 9+16 = 25. \end{equation*}$

Since the discriminant is strictly positive, this means that two different values for $\lambda$ exist:

(7) $\begin{align*} \lambda _1 &= \frac{-b - \sqrt{D}}{2a} = \frac{3-5}{2} = -1,\\ \lambda _2 &= \frac{-b + \sqrt{D}}{2a} = \frac{3+5}{2} = 4. \end{align*}$

We have now determined the two eigenvalues $\lambda_1$ and $\lambda_2$ . Note that a square matrix of size $N \times N$ always has exactly $N$ eigenvalues, each with a corresponding eigenvector. The eigenvalue specifies the size of the eigenvector.

Calculating the first eigenvector

We can now determine the eigenvectors by plugging the eigenvalues from equation (7) into equation (1) that originally defined the problem. The eigenvectors are then found by solving this system of equations.

We first do this for eigenvalue $\lambda_1$ , in order to find the corresponding first eigenvector:

$\begin{equation*} \begin{bmatrix}2&3\\2&1\end{bmatrix} \begin{bmatrix}x_{11}\\x_{12}\end{bmatrix} = -1 \begin{bmatrix}x_{11}\\x_{12}\end{bmatrix}. \end{equation*}$

Since this is simply the matrix notation for a system of equations, we can write it in its equivalent form:

(8) $\begin{eqnarray*} \left\{ \begin{array}{lr} 2x_{11} + 3x_{12} = -x_{11}\\ 2x_{11} + x_{12} = -x_{12} \end{array} \right. \end{eqnarray*}$

and solve the first equation as a function of $x_{12}$ , resulting in:

(9) $\begin{equation*} x_{11} = -x_{12}. \end{equation*}$

Since an eigenvector simply represents an orientation (the corresponding eigenvalue represents the magnitude), all scalar multiples of the eigenvector are vectors that are parallel to this eigenvector, and are therefore equivalent (If we would normalize the vectors, they would all be equal). Thus, instead of further solving the above system of equations, we can freely chose a real value for either $x_{11}$ or $x_{12}$ , and determine the other one by using equation (9).

For this example, we arbitrarily choose $x_{12} = 1$ , such that $x_{11}=-1$ . Therefore, the eigenvector that corresponds to eigenvalue $\lambda_1 = -1$ is

(10) $\begin{equation*} \vec{v}_1 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}. \end{equation*}$

Calculating the second eigenvector

Calculations for the second eigenvector are similar to those needed for the first eigenvector;
We now substitute eigenvalue $\lambda_2=4$ into equation (1), yielding:

(11) $\begin{equation*} \begin{bmatrix}2&3\\2&1\end{bmatrix} \begin{bmatrix}x_{21}\\x_{22}\end{bmatrix} = 4 * \begin{bmatrix}x_{21}\\x_{22}\end{bmatrix}. \end{equation*}$

Written as a system of equations, this is equivalent to:

(12) $\begin{eqnarray*} \left\{ \begin{array}{lr} 2x_{21} + 3x_{22} = 4x_{21}\\ 2x_{21} + x_{22} = 4x_{22} \end{array} \right. \end{eqnarray*}$

Solving the first equation as a function of $x_{21}$ resuls in:

(13) $\begin{equation*} x_{22} = \frac{3}{2}x_{21} \end{equation*}$

We then arbitrarily choose $x_{21}=2$ , and find $x_{22}=3$ . Therefore, the eigenvector that corresponds to eigenvalue $\lambda_2 = 4$ is

(14) $\begin{equation*} \vec{v}_2 = \begin{bmatrix} 3 \\ 2 \end{bmatrix}. \end{equation*}$

Conclusion

In this article we reviewed the theoretical concepts of eigenvectors and eigenvalues. These concepts are of great importance in many techniques used in computer vision and machine learning, such as dimensionality reduction by means of PCA, or face recognition by means of EigenFaces.

If you’re new to this blog, don’t forget to subscribe, or follow me on twitter!