
Subsection 10.3.4 The implicit Q theorem

Definition 10.3.4.1. Upper Hessenberg matrix.

A matrix is said to be upper Hessenberg if all entries below its first subdiagonal equal zero.

In other words, an \(m \times m \) upper Hessenberg matrix looks like

\begin{equation*} A = \left( \begin{array}{c c c c c c} \alpha_{0,0} \amp \alpha_{0,1} \amp \alpha_{0,2} \amp \cdots \amp \alpha_{0,m-2} \amp \alpha_{0,m-1} \\ \alpha_{1,0} \amp \alpha_{1,1} \amp \alpha_{1,2} \amp \cdots \amp \alpha_{1,m-2} \amp \alpha_{1,m-1} \\ 0 \amp \alpha_{2,1} \amp \alpha_{2,2} \amp \cdots \amp \alpha_{2,m-2} \amp \alpha_{2,m-1} \\ \vdots \amp \ddots \amp \ddots \amp \ddots \amp \vdots \amp \vdots \\ 0 \amp 0 \amp 0 \amp \ddots \amp \alpha_{m-2,m-2} \amp \alpha_{m-2,m-1} \\ 0 \amp 0 \amp 0 \amp \cdots \amp \alpha_{m-1,m-2} \amp \alpha_{m-1,m-1} \end{array} \right). \end{equation*}

Obviously, a tridiagonal matrix is a special case of an upper Hessenberg matrix.
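The definition can be checked mechanically: every entry strictly below the first subdiagonal must vanish. A minimal NumPy sketch (the function name `is_upper_hessenberg` is illustrative, not from the text):

```python
import numpy as np

def is_upper_hessenberg(A, tol=1e-12):
    """True if all entries below the first subdiagonal are (numerically) zero."""
    m, n = A.shape
    for j in range(n):
        for i in range(j + 2, m):   # rows strictly below the first subdiagonal
            if abs(A[i, j]) > tol:
                return False
    return True

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [0., 7., 8.]])
print(is_upper_hessenberg(A))   # True

# A tridiagonal matrix is a special case of an upper Hessenberg matrix:
T = np.diag([1., 2., 3.]) + np.diag([4., 5.], 1) + np.diag([6., 7.], -1)
print(is_upper_hessenberg(T))   # True
```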

The following theorem sets the stage for one of the most remarkable algorithms in numerical linear algebra, which allows us to greatly streamline the implementation of the shifted QR algorithm.

Theorem 10.3.4.2. Implicit Q Theorem.

Let \(A, Q, B \in \mathbb C^{m \times m} \text{,}\) where \(Q \) is unitary, \(B \) is upper Hessenberg with only positive elements on its first subdiagonal, and \(A = Q B Q^H \text{.}\) Then \(Q \) and \(B \) are uniquely determined by \(A \) and the first column of \(Q \text{.}\)

Proof.

Partition

\begin{equation*} Q = \left( \begin{array}{c | c | c | c | c | c } q_0 \amp q_1 \amp q_2 \amp \cdots \amp q_{m-2} \amp q_{m-1} \end{array} \right) \end{equation*}

and

\begin{equation*} B = \left( \begin{array}{c | c | c | c | c | c} \beta_{0,0} \amp \beta_{0,1} \amp \beta_{0,2} \amp \cdots \amp \beta_{0,m-2} \amp \beta_{0,m-1} \\ \hline \beta_{1,0} \amp \beta_{1,1} \amp \beta_{1,2} \amp \cdots \amp \beta_{1,m-2} \amp \beta_{1,m-1} \\ \hline 0 \amp \beta_{2,1} \amp \beta_{2,2} \amp \cdots \amp \beta_{2,m-2} \amp\beta_{2,m-1} \\ \hline 0 \amp 0 \amp \beta_{3,2} \amp \cdots \amp \beta_{3,m-2} \amp \beta_{3,m-1} \\ \hline \vdots \amp \vdots \amp \vdots \amp \ddots \amp \vdots \amp \vdots \\ \hline 0 \amp 0 \amp 0 \amp \cdots \amp \beta_{m-1,m-2} \amp \beta_{m-1,m-1} \end{array} \right). \end{equation*}

Notice that \(A Q = Q B \) and hence

\begin{equation*} \begin{array}{l} A \left( \begin{array}{c | c | c | c | c | c} q_0 \amp q_1 \amp q_2 \amp \cdots \amp q_{m-2} \amp q_{m-1} \end{array} \right) \\ ~~~ = \left( \begin{array}{c | c | c | c | c | c} q_0 \amp q_1 \amp q_2 \amp \cdots \amp q_{m-2} \amp q_{m-1} \end{array} \right) \\ ~~~~~~~~~~~~~~~~~~~~~ \left( \begin{array}{c | c | c | c | c | c} \beta_{0,0} \amp \beta_{0,1} \amp \beta_{0,2} \amp \cdots \amp \beta_{0,m-2} \amp \beta_{0,m-1} \\ \hline \beta_{1,0} \amp \beta_{1,1} \amp \beta_{1,2} \amp \cdots \amp \beta_{1,m-2} \amp \beta_{1,m-1} \\ \hline 0 \amp \beta_{2,1} \amp \beta_{2,2} \amp \cdots \amp \beta_{2,m-2} \amp \beta_{2,m-1} \\ \hline 0 \amp 0 \amp \beta_{3,2} \amp \cdots \amp \beta_{3,m-2} \amp \beta_{3,m-1} \\ \hline \vdots \amp \vdots \amp \vdots \amp \ddots \amp \vdots \amp \vdots \\ \hline 0 \amp 0 \amp 0 \amp \cdots \amp \beta_{m-1,m-2} \amp \beta_{m-1,m-1} \end{array} \right). \end{array} \end{equation*}

Equating the first column on the left and right, we notice that

\begin{equation*} A q_0 = \beta_{0,0} q_0 + \beta_{1,0} q_1. \end{equation*}

Now, \(q_0 \) is given and \(\| q_0 \|_2 =1 \) since \(Q \) is unitary. Hence

\begin{equation*} q_0^H A q_0 = \beta_{0,0} q_0^H q_0 + \beta_{1,0} q_0^H q_1 = \beta_{0,0}. \end{equation*}

Next,

\begin{equation*} \beta_{1,0} q_1 = A q_0 - \beta_{0,0} q_0 = \tilde q_1. \end{equation*}

Since \(\| q_1 \|_2 = 1 \) (it is a column of a unitary matrix) and \(\beta_{1,0} \) is assumed to be positive, we know that

\begin{equation*} \beta_{1,0} = \| \tilde q_1 \|_2. \end{equation*}

Finally,

\begin{equation*} q_1 = \tilde q_1 / \beta_{1,0} . \end{equation*}

The point is that the first column of \(B \) and the second column of \(Q \) are prescribed by the first column of \(Q \) and the fact that \(B \) has positive elements on its first subdiagonal. Continuing this argument, each column of \(Q \) and each column of \(B \) is prescribed, one by one.
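The first step above can be carried out numerically. A minimal NumPy sketch, with illustrative variable names, computing \(\beta_{0,0}\text{,}\) \(\tilde q_1\text{,}\) \(\beta_{1,0}\text{,}\) and \(q_1\) from a given \(A\) and unit vector \(q_0\):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5
A = rng.standard_normal((m, m))

q0 = rng.standard_normal(m)
q0 /= np.linalg.norm(q0)           # ||q0||_2 = 1

beta00 = q0 @ (A @ q0)             # beta_{0,0} = q0^H A q0
q1t = A @ q0 - beta00 * q0         # tilde q1 = A q0 - beta_{0,0} q0
beta10 = np.linalg.norm(q1t)       # beta_{1,0} = ||tilde q1||_2 (positive, generically)
q1 = q1t / beta10                  # q1 = tilde q1 / beta_{1,0}

# q1 is a unit vector orthogonal to q0, and A q0 = beta00 q0 + beta10 q1:
print(abs(q0 @ q1))                                        # near machine epsilon
print(np.linalg.norm(A @ q0 - (beta00 * q0 + beta10 * q1)))  # near machine epsilon
```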

Homework 10.3.4.1.

Give all the details of the above proof.

Solution

Assume that \(q_1, \ldots , q_k \) and the columns of \(B \) indexed \(0 \) through \(k-1 \) have been shown to be uniquely determined under the stated assumptions. We now show that \(q_{k+1} \) and the column of \(B \) indexed by \(k \) are then uniquely determined. (This is the inductive step in the proof.) Equating the column indexed by \(k \) on both sides of \(A Q = Q B \) yields

\begin{equation*} A q_k = \beta_{0,k} q_0 + \beta_{1,k} q_1 + \cdots + \beta_{k,k} q_k +\beta_{k+1,k} q_{k+1} . \end{equation*}

We can determine \(\beta_{0,k} \) through \(\beta_{k,k} \) by observing that

\begin{equation*} q_j^H A q_k = \beta_{j,k} \end{equation*}

for \(j = 0, \ldots, k \text{.}\) Then

\begin{equation*} \beta_{k+1,k} q_{k+1} = A q_k - ( \beta_{0,k} q_0 + \beta_{1,k} q_1 + \cdots + \beta_{k,k} q_k ) = \tilde q_{k+1} . \end{equation*}

Since it is assumed that \(\beta_{k+1,k} > 0 \text{,}\) it can be determined as

\begin{equation*} \beta_{k+1,k} = \| \tilde q_{k+1} \|_2 \end{equation*}

and then

\begin{equation*} q_{k+1} = \tilde q_{k+1} / \beta_{k+1,k}. \end{equation*}

In this way, the columns of \(Q \) and \(B \) can be determined, one by one.
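The inductive argument is constructive: given \(A \) and the first column \(q_0 \text{,}\) it builds \(Q \) and \(B \) column by column. (Readers may recognize this as the Arnoldi iteration.) A minimal NumPy sketch, with the illustrative name `implicit_q`, assuming real arithmetic and that no subdiagonal entry vanishes along the way:

```python
import numpy as np

def implicit_q(A, q0):
    """Given A and unit vector q0, build orthogonal Q (first column q0) and
    upper Hessenberg B with positive subdiagonal such that A Q = Q B."""
    m = A.shape[0]
    Q = np.zeros((m, m))
    B = np.zeros((m, m))
    Q[:, 0] = q0 / np.linalg.norm(q0)
    for k in range(m):
        v = A @ Q[:, k]
        for j in range(k + 1):
            B[j, k] = Q[:, j] @ v          # beta_{j,k} = q_j^H A q_k
            v -= B[j, k] * Q[:, j]         # subtract known components
        if k + 1 < m:
            B[k + 1, k] = np.linalg.norm(v)  # beta_{k+1,k} = ||tilde q_{k+1}||_2 > 0
            Q[:, k + 1] = v / B[k + 1, k]    # q_{k+1} = tilde q_{k+1} / beta_{k+1,k}
    return Q, B

rng = np.random.default_rng(1)
m = 6
A = rng.standard_normal((m, m))
q0 = np.eye(m)[:, 0]                       # any unit vector works as q0
Q, B = implicit_q(A, q0)
print(np.linalg.norm(A @ Q - Q @ B))       # near machine epsilon
print(np.linalg.norm(Q.T @ Q - np.eye(m))) # near machine epsilon
```

Note that the construction may break down if some \(\beta_{k+1,k} = 0 \text{;}\) the theorem's hypothesis that the subdiagonal entries are positive rules this out.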

Ponder This 10.3.4.2.

Notice the similarity between the above proof and the proof of the existence and uniqueness of the QR factorization!

This can be brought out by observing that

\begin{equation*} \begin{array}{l} \left( \begin{array}{c | c} q_0 \amp A \end{array} \right) \left( \begin{array}{c | c} 1 \amp 0 \\ \hline 0 \amp \left( \begin{array}{c | c | c | c | c | c} q_0 \amp q_1 \amp q_2 \amp \cdots \amp q_{m-2} \amp q_{m-1} \end{array} \right) \end{array} \right) \\ ~~~ = \left( \begin{array}{c | c | c | c | c | c} q_0 \amp q_1 \amp q_2 \amp \cdots \amp q_{m-2} \amp q_{m-1} \end{array} \right) \left( \begin{array}{c | c | c | c | c | c | c} 1 \amp \beta_{0,0} \amp \beta_{0,1} \amp \beta_{0,2} \amp \cdots \amp \beta_{0,m-2} \amp \beta_{0,m-1} \\ \hline 0 \amp \beta_{1,0} \amp \beta_{1,1} \amp \beta_{1,2} \amp \cdots \amp \beta_{1,m-2} \amp \beta_{1,m-1} \\ \hline 0 \amp 0 \amp \beta_{2,1} \amp \beta_{2,2} \amp \cdots \amp \beta_{2,m-2} \amp \beta_{2,m-1} \\ \hline 0 \amp 0 \amp 0 \amp \beta_{3,2} \amp \cdots \amp \beta_{3,m-2} \amp \beta_{3,m-1} \\ \hline \vdots \amp \vdots \amp \vdots \amp \vdots \amp \ddots \amp \vdots \amp \vdots \\ \hline 0 \amp 0 \amp 0 \amp 0 \amp \cdots \amp \beta_{m-1,m-2} \amp \beta_{m-1,m-1} \end{array} \right). \end{array} \end{equation*}

Puzzle through this observation and interpret what it means.

Remark 10.3.4.3.

In our case, \(A \) is symmetric tridiagonal, and so is \(B \text{.}\)
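When \(A \) is symmetric, \(B = Q^H A Q \) is both symmetric and upper Hessenberg, hence tridiagonal, and the recurrence in the proof collapses to a three-term recurrence. A minimal NumPy sketch of this specialization (the variable names `alpha` and `beta` for the diagonal and subdiagonal of \(B \) are illustrative), assuming no breakdown occurs:

```python
import numpy as np

rng = np.random.default_rng(2)
m = 6
S = rng.standard_normal((m, m))
A = (S + S.T) / 2                     # symmetric A

# For symmetric A, q_j^H A q_k = 0 for j < k - 1, so the general recurrence
# reduces to: A q_k = beta_{k-1} q_{k-1} + alpha_k q_k + beta_k q_{k+1}.
Q = np.zeros((m, m))
alpha = np.zeros(m)
beta = np.zeros(m - 1)
Q[:, 0] = np.eye(m)[:, 0]             # q0 = e0 (any unit vector works)
for k in range(m):
    v = A @ Q[:, k]
    if k > 0:
        v -= beta[k - 1] * Q[:, k - 1]
    alpha[k] = Q[:, k] @ v
    v -= alpha[k] * Q[:, k]
    if k + 1 < m:
        beta[k] = np.linalg.norm(v)
        Q[:, k + 1] = v / beta[k]

B = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)  # tridiagonal B
print(np.linalg.norm(A @ Q - Q @ B))  # near machine epsilon
```

This specialization of the construction to symmetric \(A \) is exactly the recurrence underlying the Lanczos method; in floating point arithmetic, orthogonality of the columns of \(Q \) gradually deteriorates, which is why practical implementations must take extra care.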