
Subsection 1.3.8 Submultiplicative norms

There are a number of properties that we would like a matrix norm to have (but that not all norms do have). Recalling that we would like for a matrix norm to measure by how much a vector is "stretched," it would be good if for a given matrix norm, \(\| \cdot \|: \mathbb C^{m \times n} \rightarrow \mathbb R \text{,}\) there are vector norms \(\| \cdot \|_\mu: \Cm \rightarrow \mathbb R \) and \(\| \cdot \|_\nu: \Cn \rightarrow \mathbb R \) such that, for arbitrary nonzero \(x \in \Cn \text{,}\) the matrix norm bounds by how much the vector is stretched:

\begin{equation*} \frac{\| A x \|_\mu}{ \| x \|_\nu} \leq \| A \| \end{equation*}

or, equivalently,

\begin{equation*} \| A x \|_\mu\leq \| A \| \| x \|_\nu \end{equation*}

where this second formulation has the benefit that it also holds if \(x = 0 \text{.}\) When this relationship between the involved norms holds, the matrix norm is said to be subordinate to the vector norms:

Definition 1.3.8.1. Subordinate matrix norm.

A matrix norm \(\| \cdot \|: \C^{m \times n} \rightarrow \mathbb R \) is said to be subordinate to vector norms \(\| \cdot \|_\mu: \Cm \rightarrow \mathbb R \) and \(\| \cdot \|_\nu: \Cn \rightarrow \mathbb R \) if, for all \(x \in \Cn \text{,}\)

\begin{equation*} \| A x \|_\mu \leq \| A \| \| x \|_\nu . \end{equation*}

If \(\| \cdot \|_\mu \) and \(\| \cdot \|_\nu \) are the same norm (but perhaps for different \(m \) and \(n \)), then \(\| \cdot \| \) is said to be subordinate to the given vector norm.

Fortunately, all the norms that we will employ in this course are subordinate matrix norms.
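To get a feel for subordinance, here is a quick numerical check of the inequality for the matrix 2-norm, sketched in Python with NumPy purely as an illustration (the language and the random data are our choices, not part of the course materials):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))   # arbitrary 5 x 3 matrix
x = rng.standard_normal(3)        # arbitrary nonzero vector

# np.linalg.norm(A, 2) is the matrix 2-norm (largest singular value),
# which is subordinate to the vector 2-norm:
lhs = np.linalg.norm(A @ x)                     # || A x ||_2
rhs = np.linalg.norm(A, 2) * np.linalg.norm(x)  # || A ||_2 || x ||_2
assert lhs <= rhs + 1e-12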

Homework 1.3.8.1.

ALWAYS/SOMETIMES/NEVER: The matrix 2-norm is subordinate to the vector 2-norm.

ALWAYS/SOMETIMES/NEVER: The Frobenius norm is subordinate to the vector 2-norm.

Answer

Both are ALWAYS true.

Now prove it.

Solution

We first prove that the matrix 2-norm is subordinate to the vector 2-norm.

W.l.o.g., assume \(x \neq 0 \text{.}\) (Why? Because if \(x = 0 \text{,}\) then obviously \(\| A x \|_2 \leq \| A \|_2 \| x \|_2 \text{.}\))

\begin{equation*} \begin{array}{l} \| A x \|_2 \\ ~~~ = ~~~~ \lt \mbox{ this is the trick... } \gt \\ \frac{\| A x \|_2} {\| x \|_2} \| x \|_2 \\ ~~~ \leq ~~~~ \lt \mbox{ if } f( z ) \mbox{ is nonnegative for all } z \neq 0 \mbox{ then } f( x ) \leq \max_{y\neq 0} f(y ) \gt \\ \max_{y \neq 0} \frac{ \| A y \|_2 }{ \| y \|_2} \| x \|_2 \\ ~~~=~~~~ \lt \mbox{ definition of the 2-norm } \gt \\ \| A \|_2 \| x \|_2 . \end{array} \end{equation*}

The second result follows from the fact that \(\| A \|_2 \leq \| A \|_F \text{,}\) which we showed in Homework 1.3.7.2: for any \(x \text{,}\) \(\| A x \|_2 \leq \| A \|_2 \| x \|_2 \leq \| A \|_F \| x \|_2 \text{.}\)

The same argument shows that any induced matrix norm is subordinate to the vector norms that induce it. W.l.o.g. assume \(x \neq 0 \text{.}\) Then

\begin{equation*} \| A x \|_{\mu} = \frac{\| A x \|_\mu}{\| x \|_\nu} \| x \|_\nu \leq \max_{y\neq 0} \frac{\| A y \|_\mu}{\| y \|_\nu} \| x \|_\nu = \| A \|_{\mu,\nu} \| x \|_\nu. \end{equation*}
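This chain of inequalities is easy to check numerically. A small sketch in Python with NumPy (an illustration only) verifies both \(\| A \|_2 \leq \| A \|_F \) and the resulting subordinance of the Frobenius norm to the vector 2-norm on random data:

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 6))
x = rng.standard_normal(6)

# Homework 1.3.7.2: || A ||_2 <= || A ||_F.
assert np.linalg.norm(A, 2) <= np.linalg.norm(A, 'fro') + 1e-12

# Hence || A x ||_2 <= || A ||_2 || x ||_2 <= || A ||_F || x ||_2.
assert np.linalg.norm(A @ x) <= np.linalg.norm(A, 'fro') * np.linalg.norm(x) + 1e-12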

Another desirable property that not all norms have is that

\begin{equation*} \| A B \| \leq \| A \| \| B \|. \end{equation*}

For this inequality to make sense, the given norm must be defined for matrices of all the sizes involved, which motivates the following definition.

Definition 1.3.8.4. Consistent matrix norm.

A matrix norm \(\| \cdot \|: \C^{m \times n} \rightarrow \mathbb R \) is said to be a consistent matrix norm if it is defined for all \(m \) and \(n \text{,}\) using the same formula for all \(m \) and \(n \text{.}\)

Obviously, this definition is a bit vague. Fortunately, it is pretty clear that the matrix norms we will use in this course, the Frobenius norm and the \(p\)-norms, are all defined consistently for all matrix sizes.

Definition 1.3.8.5. Submultiplicative matrix norm.

A consistent matrix norm \(\| \cdot \|: \C^{m \times n} \rightarrow \mathbb R \) is said to be submultiplicative if, for all \(A \) and \(B \) that can be multiplied, it satisfies

\begin{equation*} \| A B \| \leq \| A \| \| B \| . \end{equation*}
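For the norms we can readily compute, submultiplicativity can be spot-checked on random matrices. A minimal sketch in Python with NumPy (illustration only; the tolerance guards against roundoff):

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 3))

# ord=1: max column sum; ord=2: largest singular value;
# ord=inf: max row sum; 'fro': Frobenius norm.
for p in (1, 2, np.inf, 'fro'):
    lhs = np.linalg.norm(A @ B, p)
    rhs = np.linalg.norm(A, p) * np.linalg.norm(B, p)
    assert lhs <= rhs + 1e-12, f"submultiplicativity failed for ord={p}"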

Theorem 1.3.8.6. Induced matrix norms are submultiplicative.

Let \(\| \cdot \| \) be a vector norm defined for vectors of all sizes, and let \(\| \cdot \| \) also denote the matrix norm it induces. Then

\begin{equation*} \| A B \| \leq \| A \| \| B \| . \end{equation*}

In other words, induced matrix norms are submultiplicative. To prove this theorem, it helps to first prove a simpler result:

Lemma 1.3.8.7. Let \(\| \cdot \| \) be a vector norm and \(\| \cdot \| \) the matrix norm it induces. Then \(\| A x \| \leq \| A \| \| x \| \) for all \(x \text{.}\)

Proof.

If \(x = 0 \text{,}\) the result obviously holds since then \(\| A x \| = 0 \) and \(\| x \| = 0 \text{.}\) Let \(x \neq 0 \text{.}\) Then

\begin{equation*} \| A \| = \max_{x \neq 0} \frac{\| A x \|} {\| x \|} \geq \frac{\| A x \|} {\| x \|}. \end{equation*}

Rearranging this yields \(\| A x \| \leq \| A \| \| x \| \text{.}\)

We can now prove the theorem:

\begin{equation*} \begin{array}{l} \| A B \| \\ ~~~ = ~~~~ \lt \mbox{ definition of induced matrix norm } \gt\\ \max_{\| x \| = 1} \| A B x \| \\ ~~~ = ~~~~ \lt \mbox{ associativity } \gt \\ \max_{\| x \|=1} \| A ( B x ) \| \\ ~~~ \leq ~~~~ \lt \mbox{ lemma } \gt \\ \max_{\| x \|=1} ( \| A \| \| B x \| ) \\ ~~~ \leq ~~~~ \lt \mbox{ lemma } \gt \\ \max_{\| x \|=1} ( \| A \| \| B \| \| x \| ) \\ ~~~ = ~~~~ \lt \| x \| = 1 \gt \\ \| A \| \| B \| . \end{array} \end{equation*}
Homework 1.3.8.2.

Show that \(\| A x \|_\mu \leq \| A \|_{\mu,\nu} \| x \|_\nu \) for all \(x \text{.}\)

Solution

W.l.o.g. assume that \(x \ne 0 \text{.}\) (Why? Because if \(x = 0 \text{,}\) then obviously \(\| A x \|_\mu \leq \| A \|_{\mu,\nu} \| x \|_\nu \text{.}\))

\begin{equation*} \| A \|_{\mu,\nu} = \max_{y \neq 0} \frac{\| A y \|_\mu}{\| y \|_\nu} \geq \frac{\| A x \|_\mu}{\| x \|_\nu} . \end{equation*}

Rearranging this establishes the result.
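As a concrete instance, take \(\mu = \infty \) and \(\nu = 1 \text{;}\) the induced norm then has the simple closed form \(\| A \|_{\infty,1} = \max_{i,j} \vert \alpha_{i,j} \vert \) (a fact we state here without proof, purely for illustration). A quick check in Python with NumPy:

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))
x = rng.standard_normal(3)

# With mu = inf and nu = 1, the induced norm is the largest entry in
# absolute value (stated without proof, for illustration only):
norm_inf_1 = np.abs(A).max()

# Subordinance: || A x ||_inf <= || A ||_{inf,1} || x ||_1.
assert np.linalg.norm(A @ x, np.inf) <= norm_inf_1 * np.linalg.norm(x, 1) + 1e-12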

Homework 1.3.8.3.

Show that \(\| A B \|_\mu \leq \| A \|_{\mu,\nu} \| B \|_{\nu,\mu} \text{.}\)

Solution
\begin{equation*} \begin{array}{l} \| A B \|_{\mu} \\ ~~~=~~~~ \lt \mbox{ definition } \gt \\ \max_{\| x \|_\mu = 1} \| A B x \|_\mu \\ ~~~\leq~~~~ \lt \mbox{ last homework } \gt \\ \max_{\| x \|_\mu = 1} \| A \|_{\mu,\nu} \| B x \|_\nu \\ ~~~=~~~~ \lt \mbox{ algebra } \gt \\ \| A \|_{\mu,\nu} \max_{\| x \|_\mu = 1} \| B x \|_\nu \\ ~~~=~~~~ \lt \mbox{ definition } \gt \\ \| A \|_{\mu,\nu} \| B \|_{\nu,\mu} \end{array} \end{equation*}
Homework 1.3.8.4.

Show that the Frobenius norm, \(\| \cdot \|_F \text{,}\) is submultiplicative.

Solution
\begin{equation*} \begin{array}{l} \| A B \|_F^2 \\ ~~~ = ~~~~ \lt \mbox{ partition } \gt \\ \left\| \left( \begin{array}{c} \widetilde a_0^T \\ \widetilde a_1^T \\ \vdots \\ \widetilde a_{m-1}^T \end{array} \right) \left( \begin{array}{c | c | c | c} b_0 \amp b_1 \amp \cdots \amp b_{n-1} \end{array} \right) \right\|_F^2 \\ ~~~ = ~~~~ \lt \mbox{ partitioned matrix-matrix multiplication } \gt \\ \left\| \left( \begin{array}{c | c | c | c} \widetilde a_0^T b_0 \amp \widetilde a_0^T b_1 \amp \cdots \amp \widetilde a_0^T b_{n-1} \\ \hline \widetilde a_1^T b_0 \amp \widetilde a_1^T b_1 \amp \cdots \amp \widetilde a_1^T b_{n-1} \\ \hline \vdots \amp \vdots \amp \amp \vdots \\ \hline \widetilde a_{m-1}^T b_0 \amp \widetilde a_{m-1}^T b_1 \amp \cdots \amp \widetilde a_{m-1}^T b_{n-1} \end{array} \right) \right\|_F^2 \\ ~~~ = ~~~~ \lt \mbox{ definition of Frobenius norm } \gt \\ \sum_i \sum_j \vert \widetilde a_i^T b_j \vert^2 \\ ~~~ = ~~~~ \lt \mbox{ definition of Hermitian transpose vs transpose } \gt \\ \sum_i \sum_j \vert \overline{ \widetilde a_i}^H b_j \vert^2 \\ ~~~ \leq ~~~~ \lt \mbox{ Cauchy-Schwarz inequality } \gt \\ \sum_i \sum_j \| \overline {\widetilde a_i} \|_2^2 \| b_j \|_2^2 \\ ~~~ = ~~~~ \lt \mbox{ algebra and } \| \overline x \|_2 = \| x \|_2 \gt \\ \left( \sum_i \| \widetilde a_i \|_2^2 \right) \left( \sum_j \| b_j \|_2^2 \right) \\ ~~~ = ~~~~ \lt \mbox{ previous observations about the Frobenius norm } \gt \\ \| A \|_F^2 \| B \|_F^2 \end{array} \end{equation*}

Hence \(\| A B \|_F^2 \leq \| A \|_F^2 \| B \|_F^2 \text{.}\) Taking the square root of both sides leaves us with \(\| A B \|_F \leq \| A \|_F \| B \|_F \text{.}\)

This proof brings to the forefront that the notation \(\widetilde a_i^T \) leads to some possible confusion. In this particular situation, it is best to think of \(\widetilde a_i \) as a vector that, when transposed, becomes the row of \(A \) indexed with \(i \text{.}\) In this case, \(\widetilde a_i^T = \overline{\widetilde a_i}^H \) and \((\widetilde a_i^T)^H = \overline {\widetilde a_i} \) (where, recall, \(\overline x \) equals the vector with all its entries conjugated). Perhaps it is best to just work through this problem for the case where \(A \) and \(B \) are real-valued, and not worry too much about the details related to the complex-valued case...
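For the real-valued case, the structure of this proof can be traced in code. The sketch below (Python with NumPy, illustration only) computes the double sum \(\sum_i \sum_j \vert \widetilde a_i^T b_j \vert^2 \) directly and confirms the Cauchy-Schwarz bound:

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))

# || A B ||_F^2 equals the double sum of | a~_i^T b_j |^2 ...
lhs = sum(abs(A[i, :] @ B[:, j]) ** 2
          for i in range(A.shape[0]) for j in range(B.shape[1]))
assert abs(lhs - np.linalg.norm(A @ B, 'fro') ** 2) <= 1e-9

# ... and Cauchy-Schwarz bounds each term by || a~_i ||_2^2 || b_j ||_2^2,
# so the double sum is at most || A ||_F^2 || B ||_F^2.
rhs = np.linalg.norm(A, 'fro') ** 2 * np.linalg.norm(B, 'fro') ** 2
assert lhs <= rhs + 1e-9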

Homework 1.3.8.5.

For \(A \in \C^{m \times n} \) define

\begin{equation*} \| A \| = \max_{i=0}^{m-1} \max_{j=0}^{n-1} \vert \alpha_{i,j} \vert. \end{equation*}
  1. TRUE/FALSE: This is a norm.

  2. TRUE/FALSE: This is a consistent norm.

Answer
  1. TRUE

  2. TRUE

Solution
  1. This is a norm. You can prove this by checking the three conditions in the definition of a norm: \(A \neq 0 \Rightarrow \| A \| \gt 0 \text{,}\) \(\| \gamma A \| = \vert \gamma \vert \| A \| \text{,}\) and \(\| A + B \| \leq \| A \| + \| B \| \text{.}\)

  2. It is a consistent norm since it is defined for all \(m\) and \(n \text{.}\)
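This norm is, however, not submultiplicative, which illustrates that consistency alone does not give us the property we are after. A small counterexample, sketched in Python with NumPy:

import numpy as np

def maxnorm(A):
    # || A || = max_{i,j} | alpha_{i,j} |
    return np.abs(A).max()

A = np.ones((2, 2))
B = np.ones((2, 2))
print(maxnorm(A @ B))           # 2.0
print(maxnorm(A) * maxnorm(B))  # 1.0, so || A B || > || A || || B ||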

Remark 1.3.8.8.

The important take-away: The norms we tend to use in this course, the \(p\)-norms and the Frobenius norm, are all submultiplicative.

Homework 1.3.8.6.

Let \(A \in \mathbb C^{m \times n} \text{.}\)

ALWAYS/SOMETIMES/NEVER: There exists a vector

\begin{equation*} x = \left( \begin{array}{c} \chi_0 \\ \vdots \\ \chi_{n-1} \end{array} \right) \mbox{ with } \vert \chi_i \vert = 1 \mbox{ for } i = 0, \ldots, n-1 \end{equation*}

such that \(\| A \|_\infty = \| A x \|_\infty \text{.}\)

Answer

ALWAYS

Now prove it!

Solution

Partition \(A \) by rows:

\begin{equation*} A = \left( \begin{array}{c} \widetilde a_0^T \\ \vdots \\ \widetilde a_{m-1}^T \end{array} \right). \end{equation*}

We know that there exists \(k \) such that \(\| \widetilde a_k \|_1 = \| A \|_\infty \text{,}\) since the \(\infty\)-norm of a matrix equals the largest of the 1-norms of its rows. Now

\begin{equation*} \begin{array}{l} \| \widetilde a_k \|_1 \\ ~~~~= ~~~ \lt \mbox{ definition of 1-norm } \gt \\ \vert \alpha_{k,0} \vert + \cdots + \vert \alpha_{k,n-1} \vert \\ ~~~~= ~~~ \lt \mbox{ algebra } \gt \\ \frac{ \vert \alpha_{k,0} \vert }{ \alpha_{k,0} } \alpha_{k,0} + \cdots + \frac{ \vert \alpha_{k,n-1} \vert }{ \alpha_{k,n-1} } \alpha_{k,n-1}. \end{array} \end{equation*}

where we take \(\frac{ \vert \alpha_{k,j} \vert }{ \alpha_{k,j} } = 1 \) whenever \(\alpha_{k,j} = 0 \text{.}\) Vector

\begin{equation*} x = \left( \begin{array}{c} \frac{ \vert \alpha_{k,0} \vert }{ \alpha_{k,0} } \\ \vdots \\ \frac{ \vert \alpha_{k,n-1} \vert }{ \alpha_{k,n-1} } \end{array} \right) \end{equation*}

has the desired property: each of its entries has modulus one, and \(\widetilde a_k^T x = \| \widetilde a_k \|_1 = \| A \|_\infty \text{,}\) so that \(\| A x \|_\infty \geq \| A \|_\infty \text{.}\) Since also \(\| A x \|_\infty \leq \| A \|_\infty \| x \|_\infty = \| A \|_\infty \text{,}\) we conclude that \(\| A x \|_\infty = \| A \|_\infty \text{.}\)
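The construction in this solution is easy to replicate numerically. The sketch below (Python with NumPy, real-valued for simplicity, illustration only) builds \(x \) from the row with largest 1-norm and confirms that \(\| A x \|_\infty = \| A \|_\infty \text{:}\)

import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 3))

# Index k of the row with largest 1-norm, so || a~_k ||_1 = || A ||_inf.
k = np.abs(A).sum(axis=1).argmax()
row = A[k, :]

# chi_j = | alpha_{k,j} | / alpha_{k,j}; for real entries this is the sign,
# and we take it to be 1 whenever alpha_{k,j} = 0.
x = np.sign(row)
x[x == 0] = 1.0

assert np.allclose(np.abs(x), 1.0)  # every entry has modulus 1
assert np.isclose(np.linalg.norm(A @ x, np.inf), np.linalg.norm(A, np.inf))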