ALAFF Singular matrices and the eigenvalue problem

Subsection 9.2.1 Singular matrices and the eigenvalue problem

Definition 9.2.1.1. Eigenvalue, eigenvector, and eigenpair.

Let \(A \in \C^{m \times m} \text{.}\) Then \(\lambda \in \C \) and nonzero \(x \in \C^{m} \) are said to be an eigenvalue and corresponding eigenvector if \(A x = \lambda x \text{.}\) The tuple \(( \lambda, x ) \) is said to be an eigenpair.

\(A x = \lambda x \) means that the action of \(A \) on an eigenvector \(x \) is as if it were multiplied by a scalar. In other words, the direction does not change and only its length is scaled. "Scaling" and "direction" should be taken loosely here: an eigenvalue can be negative (in which case the vector ends up pointing in the opposite direction) or even complex-valued.

As part of an introductory course on linear algebra, you learned that the following statements regarding an \(m \times m\) matrix \(A \) are all equivalent:

\(A \) is nonsingular.
\(A \) has linearly independent columns.
There does not exist a nonzero vector \(x \) such that \(A x = 0 \text{.}\)
\(\Null( A ) = \{ 0 \} \text{.}\) (The null space of \(A \) is trivial.)
\(\dim(\Null( A )) = 0 \text{.}\)
\(\det( A ) \neq 0 \text{.}\)

Since \(A x = \lambda x \) can be rewritten as \(( \lambda I - A ) x = 0 \text{,}\) we note that the following statements are equivalent for a given \(m \times m \) matrix \(A \text{:}\)

There exists a vector \(x \neq 0 \) such that \(( \lambda I - A ) x = 0 \text{.}\)
\(( \lambda I - A ) \) is singular.
\(( \lambda I - A ) \) has linearly dependent columns.
The null space of \(\lambda I - A \) is nontrivial.
\(\dim(\Null( \lambda I - A )) \gt 0 \text{.}\)
\(\det( \lambda I - A ) = 0 \text{.}\)

It will become important in our discussions to pick the right equivalent statement in a given situation.

We will often talk about "the set of all eigenvalues." This set is called the spectrum of a matrix.

Definition 9.2.1.2. Spectrum of a matrix.

The set of all eigenvalues of \(A \) is denoted by \(\Lambda( A ) \) and is called the spectrum of \(A \text{.}\)

The magnitude of the eigenvalue that is largest in magnitude is known as the spectral radius. The reason is that all eigenvalues lie in the circle in the complex plane, centered at the origin, with that radius.

Definition 9.2.1.3. Spectral radius.

The spectral radius of \(A \text{,}\) \(\rho( A ) \text{,}\) equals the absolute value of the eigenvalue with largest magnitude:

\begin{equation*} \rho( A ) = \max_{\lambda \in \Lambda( A )} \vert \lambda \vert. \end{equation*}

In Subsection 7.3.3 we used the spectral radius to argue that the matrix that comes up when finding the solution to Poisson's equation is nonsingular. Key in that argument is a result known as the Gershgorin Disk Theorem.

Theorem 9.2.1.4. Gershgorin Disk Theorem.

Let \(A \in \mathbb C^{m \times m} \text{,}\)

\begin{equation*} A = \left( \begin{array}{c c c c} \alpha_{0,0} \amp \alpha_{0,1} \amp \cdots \amp \alpha_{0,m-1} \\ \alpha_{1,0} \amp \alpha_{1,1} \amp \cdots \amp \alpha_{1,m-1} \\ \vdots \amp \vdots \amp \amp \vdots \\ \alpha_{m-1,0} \amp \alpha_{m-1,1} \amp \cdots \amp \alpha_{m-1,m-1} \end{array} \right), \end{equation*}

\begin{equation*} \rho_i(A) = \sum_{j \neq i} \vert \alpha_{i,j} \vert, \end{equation*}

and

\begin{equation*} R_i(A) = \{ x ~\mbox{s.t.}~ \vert x - \alpha_{i,i} \vert \leq \rho_i \}. \end{equation*}

In other words, \(\rho_i(A) \) equals the sum of the absolute values of the off diagonal elements of \(A \) in row \(i \) and \(R_i(A) \) equals the set of all points in the complex plane that are within a distance \(\rho_i \) of diagonal element \(\alpha_{i,i} \text{.}\) Then

\begin{equation*} \Lambda( A ) \subset \cup_{i} R_i(A). \end{equation*}

In other words, the eigenvalues lie in the union of these disks.

Proof.

Let \(\lambda \in \Lambda( A ) \text{.}\) Then \(( \lambda I - A ) x = 0 \) for some nonzero vector \(x \text{.}\) W.l.o.g. assume that index \(i \) has the property that \(1 = \chi_i \geq \vert \chi_j \vert \) for \(j \neq i \text{.}\) Then

\begin{equation*} - \alpha_{i,0} \chi_0 - \cdots - \alpha_{i,i-1} \chi_{i-1} + ( \lambda - \alpha_{i,i} ) - \alpha_{i,i+1} \chi_{i+1} - \cdots - \alpha_{i,m-1} \chi_{m-1} = 0 \end{equation*}

or, equivalently,

\begin{equation*} \lambda - \alpha_{i,i} = \alpha_{i,0} \chi_0 + \cdots + \alpha_{i,i-1} \chi_{i-1} + \alpha_{i,i+1} \chi_{i+1} + \cdots + \alpha_{i,m-1} \chi_{m-1} . \end{equation*}

Hence

\begin{equation*} \begin{array}{l} \vert \lambda - \alpha_{i,i} \vert \\ ~~~=~~~~ \\ \vert \alpha_{i,0} \chi_0 + \cdots + \alpha_{i,i-1} \chi_{i-1} + \alpha_{i,i+1} \chi_{i+1} + \cdots + \alpha_{i,m-1} \chi_{m-1} \vert \\ ~~~ \leq ~~~~ \\ \vert \alpha_{i,0} \chi_0 \vert + \cdots + \vert \alpha_{i,i-1} \chi_{i-1} \vert + \vert \alpha_{i,i+1} \chi_{i+1} \vert + \cdots + \vert \alpha_{i,m-1} \chi_{m-1} \vert \\ ~~~=~~~~ \\ \vert \alpha_{i,0} \vert \vert \chi_0 \vert + \cdots + \vert \alpha_{i,i-1} \vert \vert \chi_{i-1} \vert + \vert \alpha_{i,i+1} \vert \vert \chi_{i+1} \vert + \cdots + \vert \alpha_{i,m-1} \vert \vert \chi_{m-1} \vert \\ ~~~ \leq ~~~~ \\ \vert \alpha_{i,0} \vert + \cdots + \vert \alpha_{i,i-1} \vert + \vert \alpha_{i,i+1} \vert + \cdots + \vert \alpha_{i,m-1} \vert \\ ~~~ \leq ~~~~ \\ \rho_i(A). \end{array} \end{equation*}

It is important to note that it is not necessarily the case that each such disk has exactly one eigenvalue in it. There is, however, a slightly stronger result than Theorem 9.2.1.4.

Corollary 9.2.1.5.

Let \(A \) and \(R_i(A) \) be as defined in Theorem 9.2.1.4. Let \(K \) and \(K^C \) be disjoint subsets of \(\{ 0, \ldots, m-1 \}\) such that \(K \cup K^C = \{ 0, \ldots, m-1 \} \text{.}\) In other words, let \(K \) and \(K^C \) partition \(\{ 0, \ldots , m-1 \} \text{.}\) If

\begin{equation*} \left( \cup_{k \in K} R_k(A) \right) \cap \left( \cup_{j \in K^C} R_j(A) \right) = \emptyset \end{equation*}

then \(\cup_{k \in K} R_k(A) \) contains exactly \(\vert K \vert \) eigenvalues of \(A \) (multiplicity counted). In other words, if \(\cup_{k \in K} R_k(A) \) does not intersect with any of the other disks, then it contains as many eigenvalues of \(A \) (multiplicity counted) as there are elements of \(K \text{.}\)

Proof.

The proof splits \(A = D + ( A - D ) \) where \(D\) equals the diagonal of \(A \) and considers \(A_\omega = D + \omega ( A - D ) \text{,}\) which varies continuously with \(\omega \text{.}\) One can argue that the disks \(R_i( A_0 ) \) start with only one eigenvalue each and only when they start intersecting can an eigenvalue "escape" the disk in which it started. We skip the details since we won't need this result in this course.

Through a few homeworks, let's review basic facts about eigenvalues and eigenvectors.

Homework 9.2.1.1.

Let \(A \in \mathbb C^{m \times m} \text{.}\)

TRUE/FALSE: \(0 \in \Lambda( A ) \) if and only \(A \) is singular.

Answer

TRUE

Now prove it!

Solution

\((\Rightarrow) \text{:}\) Assume \(0 \in \Lambda( A ) \text{.}\) Let \(x\) be an eigenvector associated with eigenvalue \(0 \text{.}\) Then \(A x = 0 x = 0 \text{.}\) Hence there exists a nonzero vector \(x \) such that \(A x = 0 \text{.}\) This implies \(A \) is singular.
\((\Leftarrow) \text{:}\) Assume \(A \) is singular. Then there exists \(x \neq 0 \) such that \(A x = 0 \text{.}\) Hence \(A x = 0 x \) and \(0 \) is an eigenvalue of \(A \text{.}\)

Homework 9.2.1.2.

Let \(A \in \mathbb C^{m \times m} \) be Hermitian.

ALWAYS/SOMETIMES/NEVER: All eigenvalues of \(A \) are real-valued.

Answer

ALWAYS

Now prove it!

Solution

Let \(( \lambda, x ) \) be an eigenpair of \(A \text{.}\) Then

\begin{equation*} A x = \lambda x \end{equation*}

and hence

\begin{equation*} x^H A x = \lambda x^H x . \end{equation*}

If we now conjugate both sides we find that

\begin{equation*} \overline{x^H A x} = \overline{\lambda x^H x } \end{equation*}

which is equivalent to

\begin{equation*} (x^H A x)^H = (\lambda x^H x )^H \end{equation*}

which is equivalent to

\begin{equation*} x^H A x = \overline \lambda x^H x \end{equation*}

since \(A \) is Hermitian. We conclude that

\begin{equation*} \overline \lambda = \frac{x^H A x}{x^H x} = \lambda \end{equation*}

(since \(x^H x \neq 0 \)).

Homework 9.2.1.3.

Let \(A \in \mathbb C^{m \times m} \) be Hermitian positive definite (HPD).

ALWAYS/SOMETIMES/NEVER: All eigenvalues of \(A \) are positive.

Answer

ALWAYS

Now prove it!

Solution

Let \(( \lambda, x ) \) be an eigenpair of \(A \text{.}\) Then

\begin{equation*} A x = \lambda x \end{equation*}

and hence

\begin{equation*} x^H A x = \lambda x^H x \end{equation*}

and finally (since \(x \neq 0 \))

\begin{equation*} \lambda = \frac{x^H A x }{x^H x}. \end{equation*}

Since \(A \) is HPD, both \(x^H A x \) and \(x^H x \) are positive, which means \(\lambda\) is positive.

The converse is also always true, but we are not ready to prove that yet.

Homework 9.2.1.4.

Let \(A \in \mathbb C^{m \times m} \) be Hermitian, \(( \lambda, x ) \) and \(( \mu, y ) \) be eigenpairs associated with \(A \text{,}\) and \(\lambda \neq \mu \text{.}\)

ALWAYS/SOMETIMES/NEVER: \(x^H y = 0\)

Answer

ALWAYS

Now prove it!

Solution

Since

\begin{equation*} A x = \lambda x \mbox{ and } A y = \mu y \end{equation*}

we know that

\begin{equation*} y^H A x = \lambda y^H x \mbox{ and } x^H A y = \mu x^H y \end{equation*}

and hence (remembering that the eigenvalues are real-valued)

\begin{equation*} \lambda y^H x = y^H A x = \overline{x^H A y } = \mu \overline{ x^H y } = \mu y^H x. \end{equation*}

We can rewrite this as

\begin{equation*} ( \lambda - \mu ) y^H x = 0. \end{equation*}

Since \(\lambda \neq \mu \) this implies that \(y^H x = 0 \) and hence \(x^H y = 0 \text{.}\)

Homework 9.2.1.5.

Let \(A \in \mathbb C^{m \times m} \text{,}\) \(( \lambda, x ) \) and \(( \mu, y ) \) be eigenpairs, and \(\lambda \neq \mu \text{.}\) Prove that \(x \) and \(y \) are linearly independent.

Solution

Proof by contradiction: Under the assumptions of the homework, we will show that assuming that \(x \) and \(y \) are linearly dependent leads to a contradiction.

If nonzero \(x \) and nonzero \(y \) are linearly dependent, then there exists \(\gamma \neq 0\) such that \(y = \gamma x \text{.}\) Then

\begin{equation*} A y = \mu y \end{equation*}

implies that

\begin{equation*} A ( \gamma x ) = \mu ( \gamma x ) \end{equation*}

and hence

\begin{equation*} \gamma \lambda x = \mu \gamma x. \end{equation*}

Rewriting this we get that

\begin{equation*} ( \lambda - \mu ) \gamma x = 0. \end{equation*}

Since \(\lambda \neq \mu \) and \(\gamma \neq 0\) this means that \(x = 0 \) which contradicts that \(x \) is an eigenvector.

We conclude that \(x \) and \(y \) are linearly independent.

We now generalize this insight.

Homework 9.2.1.6.

Let \(A \in \mathbb C^{m \times m} \text{,}\) \(k \leq m \text{,}\) and \(( \lambda_i, x_i ) \) for \(1 \leq i \lt k \) be eigenpairs of this matrix. Prove that if \(\lambda_i \neq \lambda_j \) when \(i \neq j \) then the eigenvectors \(x_i \) are linearly independent. In other words, given a set of distinct eigenvalues, a set of vectors created by taking one eigenvector per eigenvalue is linearly independent.

Hint

Prove by induction.

Solution

Proof by induction on \(k \text{.}\)

Base Case: \(k = 1 \text{.}\) This is trivially.
Assume the result holds for \(1 \leq k \lt m \text{.}\) Show it holds for \(k+1 \text{.}\)

The I.H. means that \(x_0, \ldots, x_{k-1} \) are linearly independent. We need to show that \(x_k \) is not a linear combination of \(x_0, \ldots , x_{k-1} \text{.}\) We will do so via a proof by contradiction.

Assume \(x_k \) is a linear combination of \(x_0, \ldots , x_{k-1} \) so that

\begin{equation*} x_k = \gamma_0 x_0 + \cdots + \gamma_{k-1} x_{k-1} \end{equation*}

with at least one \(\gamma_j \neq 0 \text{.}\) We know that \(A x_k = \lambda_k x_k\) and hence

\begin{equation*} A ( \gamma_0 x_0 + \cdots + \gamma_{k-1} x_{k-1} ) = \lambda_k ( \gamma_0 x_0 + \cdots + \gamma_{k-1} x_{k-1} ). \end{equation*}

Since \(A x_i = \lambda_i x_i \text{,}\) we conclude that

\begin{equation*} \gamma_0 \lambda_0 x_0 + \cdots + \gamma_{k-1} \lambda_{k-1} x_{k-1} = \gamma_0 \lambda_{xk-1} x_0 + \cdots + \gamma_{k-1} \lambda_k x_{k-1} \end{equation*}

or, equivalently,

\begin{equation*} \gamma_0 ( \lambda_0 - \lambda_k ) x_0 + \cdots + \gamma_{k-1} ( \lambda_{k-1} - \lambda_k ) x_{k-1} = 0. \end{equation*}

Since at least one \(\gamma_i \neq 0 \) and \(\lambda_i \neq \lambda_k \) for \(0 \leq i \lt k \text{,}\) we conclude that \(x_0, \ldots , x_{k-1} \) are linearly dependent, which is a contradiction.

Hence, \(x_0, \ldots , x_{k} \) are linearly independent.
By the Principle of Mathematical Induction, the result holds for \(1 \leq k \leq m \text{.}\)

We now return to some of the matrices we saw in Week 7.

Homework 9.2.1.7.

Consider the matrices

\begin{equation*} A = \left( \begin{array}{r r r r r r} 2 \amp -1 \amp \amp \amp \\ -1 \amp 2 \amp -1 \amp \amp \\ \amp \ddots \amp \ddots \amp \ddots \amp \\ \amp \amp -1 \amp 2 \amp -1 \\ \amp \amp \amp -1 \amp 2 \\ \end{array} \right) \end{equation*}

and

\begin{equation*} \left( \begin{array} {r r r r | r r r r | r r } 4 \amp -1 \amp \amp \amp -1 \amp \amp \amp \amp \amp \\ -1 \amp 4 \amp -1 \amp \amp \amp -1 \amp \amp \amp \amp \\ \amp -1 \amp 4 \amp -1 \amp \amp \amp -1 \amp \amp \amp \\ \amp \amp -1 \amp 4 \amp \amp \amp \amp -1 \amp \amp \\ \hline -1 \amp \amp \amp \amp 4 \amp -1 \amp \amp \amp -1 \amp \\ \amp -1 \amp \amp \amp -1 \amp 4 \amp -1 \amp \amp \amp \ddots \\ \amp \amp -1 \amp \amp \amp -1 \amp 4 \amp -1 \amp \amp \\ \amp \amp \amp -1 \amp \amp \amp -1 \amp 4 \amp \amp \\ \hline \amp \amp \amp \amp -1 \amp \amp \amp \amp 4 \amp \ddots \\ \amp \amp \amp \amp \amp \amp \ddots \amp \amp \ddots \amp \ddots \end{array} \right) \end{equation*}

ALWAYS/SOMETIMES/NEVER: All eigenvalues of these matrices are nonnegative.

ALWAYS/SOMETIMES/NEVER: All eigenvalues of the first matrix are positive.

Answer

ALWAYS: All eigenvalues of these matrices are nonnegative.

ALWAYS: All eigenvalues of the first matrix are positive. (So are all the eigenvalues of the second matrix, but proving that is a bit trickier.)

Now prove it!

Solution

For the first matrix, we can use the Gershgorin disk theorem to conclude that all eigenvalues of the matrix lie in the set \(\{ x \mbox{ s.t. } \vert x - 2 \vert \leq 2 \text{.}\) We also notice that the matrix is symmetric, which means that its eigenvalues are real-valued. Hence the eigenvalues are nonnegative. A similar argument can be used for the second matrix.

Now, in Homework 7.2.1.1 we showed that the first matrix is nonsingular. Hence, it cannot have an eigenvalue equal to zero. We conclude that its eigenvalues are all positive.

It can be shown that the second matrix is also nonsingular, and hence has positive eigenvalues. However, that is a bit nasty to prove...