ALAFF The characteristic polynomial

Subsection 9.2.2 The characteristic polynomial

We start by discussing how to further characterize eigenvalues of a given matrix \(A \text{.}\) We say "characterize" because none of the discussed insights lead to practical algorithms for computing them, at least for matrices larger than \(4 \times 4 \text{.}\)

Homework 9.2.2.1.

Let

\begin{equation*} A = \left( \begin{array}{c c} \alpha_{0,0} \amp \alpha_{0,1} \\ \alpha_{1,0} \amp \alpha_{1,1} \end{array} \right) \end{equation*}

be a nonsingular matrix. Show that

\begin{equation*} \left( \begin{array}{c c} \alpha_{0,0} \amp \alpha_{0,1} \\ \alpha_{1,0} \amp \alpha_{1,1} \end{array} \right)^{-1} = \frac{1} {\alpha_{0,0} \alpha_{1,1} - \alpha_{1,0} \alpha_{0,1}} \left( \begin{array}{c c} \alpha_{1,1} \amp -\alpha_{0,1} \\ -\alpha_{1,0} \amp \alpha_{0,0} \end{array} \right) . \end{equation*}

Solution

Recall that, for square matrices, \(B = A^{-1} \) if and only if \(A B = I \text{.}\)

\begin{equation*} \begin{array}{l} \left( \begin{array}{c c} \alpha_{0,0} \amp \alpha_{0,1} \\ \alpha_{1,0} \amp \alpha_{1,1} \end{array} \right) \frac{1} {\alpha_{0,0} \alpha_{1,1} - \alpha_{1,0} \alpha_{0,1}} \left( \begin{array}{c c} \alpha_{1,1} \amp -\alpha_{0,1} \\ -\alpha_{1,0} \amp \alpha_{0,0} \end{array} \right) \\ ~~~=~~~~ \\ \frac{1} {\alpha_{0,0} \alpha_{1,1} - \alpha_{1,0} \alpha_{0,1}} \left( \begin{array}{c c} \alpha_{0,0} \alpha_{1,1} - \alpha_{0,1} \alpha_{1,0} \amp -\alpha_{0,0} \alpha_{0,1} + \alpha_{0,1} \alpha_{0,0} \\ {\alpha_{1,0} \alpha_{1,1} - \alpha_{1,1} \alpha_{1,0}} \amp - \alpha_{0,1} \alpha_{1,0} + \alpha_{0,0} \alpha_{1,1} \end{array} \right) \\ ~~~=~~~~ \\ \left( \begin{array}{c c} 1 \amp 0 \\ 0 \amp 1 \end{array} \right) . \end{array} \end{equation*}

What we notice from the last exercises is that \({\alpha_{0,0} \alpha_{1,1} - \alpha_{1,0} \alpha_{0,1}} \) characterizes whether \(A \) has an inverse or not:

\begin{equation*} A = \left( \begin{array}{c c} \alpha_{0,0} \amp \alpha_{0,1} \\ \alpha_{1,0} \amp \alpha_{1,1} \end{array} \right) \end{equation*}

is nonsingular if and only if \({\alpha_{0,0} \alpha_{1,1} - \alpha_{1,0} \alpha_{0,1}} \neq 0 \text{.}\) The expression \({\alpha_{0,0} \alpha_{1,1} - \alpha_{1,0} \alpha_{0,1}} \text{.}\) is known as the determinant of the \(2 \times 2 \) matrix and denoted by \(\det( A ) \text{:}\)

Definition 9.2.2.1.

The determinant of

\begin{equation*} A = \left( \begin{array}{c c} \alpha_{0,0} \amp \alpha_{0,1} \\ \alpha_{1,0} \amp \alpha_{1,1} \end{array} \right) \end{equation*}

is given by

\begin{equation*} \det( A ) = {\alpha_{0,0} \alpha_{1,1} - \alpha_{1,0} \alpha_{0,1}}. \end{equation*}

Now, \(\lambda \) is an eigenvalue of \(A \) if and only if \(\lambda I - A\) is singular. For our \(2 \times 2 \) matrix

\begin{equation*} \lambda I - A = \left( \begin{array}{c c} \lambda - \alpha_{0,0} \amp - \alpha_{0,1} \\ - \alpha_{1,0} \amp \lambda - \alpha_{1,1} \end{array} \right) \end{equation*}

is singular if and only if

\begin{equation*} ( \lambda - \alpha_{0,0} ) ( \lambda - \alpha_{1,1} ) - ( - \alpha_{1,0} ) ( - \alpha_{0,1} ) = 0. \end{equation*}

In other words, \(\lambda \) is an eigenvalue of this matrix if and only if \(\lambda \) is a root of

\begin{equation*} \begin{array}{rcl} p_A( \lambda ) \amp = \amp ( \lambda - \alpha_{0,0} ) ( \lambda - \alpha_{1,1} ) - ( - \alpha_{1,0} ) ( - \alpha_{0,1} ) \\ \amp = \amp \lambda^2 - ( \alpha_{0,0} + \alpha_{1,1} ) \lambda + ( \alpha_{0,0} \alpha_{1,1} - \alpha_{1,0} \alpha_{0,1} ), \end{array} \end{equation*}

which is a polynomial of degree two. A polynomial of degree two has two roots (counting multiplicity). This polynomial is known as the characteristic polynomial of the \(2 \times 2 \) matrix.

We now have a means for computing eigenvalues and eigenvectors of a \(2 \times 2 \) matrix:

Form the characteristic polynomial \(p_A( \lambda ) \text{.}\)
Solve for its roots, \(\lambda_0 \) and \(\lambda_1 \text{.}\)
Find nonzero vectors \(x_0 \) and \(x_1 \) in the null spaces of \(\lambda_0 I - A \) and \(\lambda_1 I - A \text{,}\) respectively.

The notion of a determinant of a matrix, \(\det( A ) \text{,}\) generalizes to \(m \times m \) matrices as does the fact that \(A \) is nonsingular if and only if \(\det( A ) \neq 0 \text{.}\) Similarly, the notion of a characteristic polynomial is then generalized to \(m \times m \) matrices:

Definition 9.2.2.2. Characteristic polynomial.

The characteristic polynomial of \(m \times m \) matrix \(A \) is given by

\begin{equation*} p_A( \lambda ) = \det( \lambda I - A ). \end{equation*}

At some point in your education, you may have been taught how to compute the determinant of an arbitrary \(m \times m \) matrix. For this course, such computations have no practical applicability, when matrices are larger than \(3 \times 3 \) or so, and hence we don't spend time on how to compute determinants. What is important to our discussions is that for an \(m \times m \) matrix \(A \) the characteristic polynomial is a polynomial of degree \(m \text{,}\) a result we formalize in a theorem without giving a proof:

Theorem 9.2.2.3.

If \(A \in \Cmxm \) then \(p_A( \lambda ) = \det( \lambda I - A ) \) is a polynomial of degree \(m \text{.}\)

This insight now allows us to further characterize the set of all eigenvalues of a given matrix:

Theorem 9.2.2.4.

Let \(A \in \C^{m \times m} \text{.}\) Then \(\lambda \in \Lambda( A ) \) if and only if \(p_A( \lambda ) = 0 \text{.}\)

Proof.

This follows from the fact that a matrix has a nontrivial null space if and only if its determinant is zero. Hence, \(p_A( \lambda ) = 0 \) if and only if there exists \(x \neq 0 \) such that \((\lambda I - A ) x = 0 \text{.}\) \((\lambda I - A ) x = 0 \) if and only if \(A x = \lambda x \text{.}\)

Recall that a polynomial of degree \(m \text{,}\)

\begin{equation*} p_m( \chi ) = \chi^m + \cdots + \gamma_1 \chi + \gamma_0, \end{equation*}

can be factored as

\begin{equation*} p_m( \chi ) = ( \chi - \chi_0 )^{m_0} \cdots ( \chi - \chi_{k-1} )^{m_k}, \end{equation*}

where the \(\chi_i \) are distinct roots, \(m_i\) equals the multiplicity of the root, and \(m_0 + \cdots + m_{k-1} = m \text{.}\) The concept of (algebraic) multiplicity carries over to eigenvalues.

Definition 9.2.2.5. Algebraic multiplicity of an eigenvalue.

Let \(A \in \Cmxm \) and \(p_m( \lambda ) \) its characteristic polynomial. Then the (algebraic) multiplicity of eigenvalue \(\lambda_i \) equals the multiplicity of the corresponding root of the polynomial.

Often we will list the eigenvalues of \(A \in \Cmxm \) as \(m \) eigenvalues \(\lambda_0, \ldots , \lambda_{m-1} \) even when some are equal (have algebraic multiplicity greater than one). In this case we say that we are counting multiplicity. In other words, we are counting each eigenvalue (root of the characteristic polynomial) separately, even if they are equal.

An immediate consequence is that \(A \) has \(m \) eigenvalues (multiplicity counted), since a polynomial of degree \(m \) has \(m \) roots (multiplicity counted), which is captured in the following lemma.

Lemma 9.2.2.6.

If \(A \in \Cmxm \) then \(A \) has \(m \) eigenvalues (multiplicity counted).

The relation between eigenvalues and the roots of the characteristic polynomial yields a disconcerting insight: A general formula for the eigenvalues of an arbitrary \(m \times m \) matrix with \(m \gt 4 \) does not exist. The reason is that "Galois theory" tells us that there is no general formula for the roots of a polynomial of degree \(m \gt 4 \) (details go beyond the scope of this course). Given any polynomial \(p_m( \chi ) \) of degree \(m \text{,}\) an \(m \times m \) matrix can be constructed such that its characteristic polynomial is \(p_m( \lambda ) \text{.}\) In particular, if

\begin{equation*} p_m( \chi ) = \chi^m + \alpha_{m-1} \chi^{m-1} + \cdots + \alpha_1 \chi + \alpha_0 \end{equation*}

and

\begin{equation*} A = \left( \begin{array}{ c c c c c c c c} - \alpha_{n-1} \amp - \alpha_{n-2} \amp - \alpha_{n-3} \amp \cdots \amp - \alpha_1 \amp - \alpha_{0} \\ 1 \amp 0 \amp 0 \amp \cdots \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \amp \cdots \amp 0 \amp 0 \\ 0 \amp 0 \amp 1 \amp \cdots \amp 0 \amp 0 \\ \vdots \amp \vdots \amp \vdots \amp \ddots \amp \vdots \amp \vdots \\ 0 \amp 0 \amp 0 \amp \cdots \amp 1 \amp 0 \\ \end{array} \right) \end{equation*}

then

\begin{equation*} p_m( \lambda ) = \det( \lambda I - A ) . \end{equation*}

(Since we don't discuss how to compute the determinant of a general matrix, you will have to take our word for this fact.) Hence, we conclude that no general formula can be found for the eigenvalues for \(m \times m \) matrices when \(m \gt 4 \text{.}\) What we will see is that we will instead create algorithms that converge to the eigenvalues and/or eigenvectors of matrices.

Corollary 9.2.2.7.

If \(A \in \R^{m \times m} \) is real-valued then some or all of its eigenvalues may be complex-valued. If eigenvalue \(\lambda \) is complex-valued, then its conjugate, \(\bar \lambda \text{,}\) is also an eigenvalue. Indeed, the complex eigenvalues of a real-valued matrix come in complex pairs.

Proof.

It can be shown that if \(A \) is real-valued, then the coefficients of its characteristic polynomial are all real -valued. Complex roots of a polynomial with real coefficients come in conjugate pairs.

The last corollary implies that if \(m \) is odd, then at least one eigenvalue of a real-valued \(m \times m \) matrix must be real-valued.

Corollary 9.2.2.8.

If \(A \in \R^{m \times m} \) is real-valued and \(m\) is odd, then at least one of the eigenvalues of \(A\) is real-valued.

It would seem that the natural progression for computing eigenvalues and eigenvectors would be

Form the characteristic polynomial \(p_A( \lambda ) \text{.}\)
Solve for its roots, \(\lambda_0, \ldots, \lambda_{m-1} \text{,}\) which give us the eigenvalues of \(A \text{.}\)
Find eigenvectors associated with the eigenvalues by finding bases for the null spaces of \(\lambda_i I - A \text{.}\)

However, as mentioned, finding the roots of a polynomial is a problem. Moreover, finding vectors in the null space is also problematic in the presence of roundoff error. For this reason, the strategy for computing eigenvalues and eigenvectors is going to be to compute approximations of eigenvectors hand in hand with the eigenvalues.