## Subsection 1.3.3 The Frobenius norm

###### Definition 1.3.3.1. The Frobenius norm.

The Frobenius norm $\| \cdot \|_F : \mathbb C^{m \times n} \rightarrow \mathbb R$ is defined for $A \in \mathbb C^{m \times n}$ by

\begin{equation*} \| A \|_F = \sqrt{ \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \vert \alpha_{i,j} \vert^2 } = \sqrt{ \begin{array}{c c c c c} \vert \alpha_{0,0} \vert^2 \amp + \amp \cdots \amp + \amp \vert \alpha_{0,n-1} \vert^2 ~+ \\ \amp \amp \vdots \amp \amp \\ \vert \alpha_{m-1,0} \vert^2 \amp + \amp \cdots \amp + \amp \vert \alpha_{m-1,n-1} \vert^2 \end{array} } . \end{equation*}

One can think of the Frobenius norm as taking the columns of the matrix, stacking them on top of each other to create a vector of size $m \cdot n \text{,}$ and then taking the vector 2-norm of the result.
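This stacking view is easy to check numerically. The sketch below (assuming NumPy as the computational tool, which this text does not itself prescribe) computes the Frobenius norm both straight from the definition and by flattening the columns into one long vector:

```python
import numpy as np

# A small complex matrix to experiment with.
A = np.array([[1 + 2j, 3 - 1j],
              [0 + 1j, 2 + 0j],
              [4 - 3j, 1 + 1j]])

# Frobenius norm straight from the definition: sum of |alpha_{i,j}|^2.
frob_from_def = np.sqrt(np.sum(np.abs(A) ** 2))

# Stack the columns on top of each other (column-major flatten) and
# take the vector 2-norm of the resulting vector of size m * n.
stacked = A.flatten(order="F")
frob_from_stack = np.linalg.norm(stacked, 2)

assert np.isclose(frob_from_def, frob_from_stack)
assert np.isclose(frob_from_def, np.linalg.norm(A, "fro"))
```

Note that `np.linalg.norm(A, "fro")` is NumPy's built-in Frobenius norm, which agrees with both computations.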

###### Homework 1.3.3.1.

Partition $m \times n$ matrix $A$ by columns:

\begin{equation*} A = \left( \begin{array}{c | c | c} a_0 \amp \cdots \amp a_{n-1} \end{array} \right). \end{equation*}

Show that

\begin{equation*} \| A \|_F^2 = \sum_{j=0}^{n-1} \| a_j \|_2^2. \end{equation*}
Solution
\begin{equation*} \begin{array}{l} \| A \|_F \\ ~~~=~~~~ \lt \mbox{ definition} \gt \\ \sqrt{ \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \vert \alpha_{i,j} \vert^2 } \\ ~~~=~~~~ \lt \mbox{ commutativity of addition} \gt \\ \sqrt{ \sum_{j=0}^{n-1} \sum_{i=0}^{m-1} \vert \alpha_{i,j} \vert^2 } \\ ~~~=~~~~ \lt \mbox{ definition of vector 2-norm } \gt \\ \sqrt{ \sum_{j=0}^{n-1} \| a_j \|_2^2 }. \end{array} \end{equation*}

Squaring both sides yields $\| A \|_F^2 = \sum_{j=0}^{n-1} \| a_j \|_2^2 \text{.}$
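The column-partitioned identity can also be sanity-checked numerically on a random complex matrix (a minimal sketch, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# Sum of the squared 2-norms of the columns a_0, ..., a_{n-1}.
col_sum = sum(np.linalg.norm(A[:, j], 2) ** 2 for j in range(n))

# ||A||_F^2 equals that sum.
assert np.isclose(np.linalg.norm(A, "fro") ** 2, col_sum)
```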
###### Homework 1.3.3.2.

Prove that the Frobenius norm is a norm.

Solution

Establishing that this function is positive definite and homogeneous is straightforward. To show that the triangle inequality holds, it helps to realize that if $A = \left( \begin{array}{c | c | c | c} a_0 \amp a_1 \amp \cdots \amp a_{n-1} \end{array} \right)$ then

\begin{equation*} \begin{array}{l} \| A \|_F \\ ~~~=~~~~ \lt \mbox{ definition} \gt \\ \sqrt{ \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \vert \alpha_{i,j} \vert^2 } \\ ~~~=~~~~ \lt \mbox{ commutativity of addition} \gt \\ \sqrt{ \sum_{j=0}^{n-1} \sum_{i=0}^{m-1} \vert \alpha_{i,j} \vert^2 } \\ ~~~=~~~~ \lt \mbox{ definition of vector 2-norm } \gt \\ \sqrt{ \sum_{j=0}^{n-1} \| a_j \|_2^2 } \\ ~~~=~~~~ \lt \mbox{ definition of vector 2-norm } \gt \\ \sqrt{ \left\| \left( \begin{array}{c} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{array} \right) \right\|_2^2 }. \end{array} \end{equation*}

In other words, it equals the vector 2-norm of the vector that is created by stacking the columns of $A$ on top of each other. One can then exploit the fact that the vector 2-norm obeys the triangle inequality.

###### Homework 1.3.3.3.

Partition $m \times n$ matrix $A$ by rows:

\begin{equation*} A = \left( \begin{array}{c} \widetilde a_0^T \\ \hline \vdots \\ \hline \widetilde a_{m-1}^T \end{array} \right). \end{equation*}

Show that

\begin{equation*} \| A \|_F^2 = \sum_{i=0}^{m-1} \| \widetilde a_i \|_2^2, \end{equation*}

where $\widetilde a_i = \left( \widetilde a_i^T \right)^T \text{.}$

Solution
\begin{equation*} \begin{array}{l} \| A \|_F \\ ~~~=~~~~ \lt \mbox{ definition} \gt \\ \sqrt{ \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \vert \alpha_{i,j} \vert^2 } \\ ~~~=~~~~ \lt \mbox{ definition of vector 2-norm } \gt \\ \sqrt{ \sum_{i=0}^{m-1} \| \widetilde a_i \|_2^2 }. \end{array} \end{equation*}

Squaring both sides yields $\| A \|_F^2 = \sum_{i=0}^{m-1} \| \widetilde a_i \|_2^2 \text{.}$
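As with the column partitioning, the row-partitioned identity is easy to verify on a random example (a minimal sketch, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 6
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# Sum of the squared 2-norms of the rows \tilde a_0^T, ..., \tilde a_{m-1}^T.
row_sum = sum(np.linalg.norm(A[i, :], 2) ** 2 for i in range(m))

# ||A||_F^2 equals that sum.
assert np.isclose(np.linalg.norm(A, "fro") ** 2, row_sum)
```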

Let us review the definition of the transpose of a matrix (which we have already used when defining the dot product of two real-valued vectors and when identifying a row in a matrix):

###### Definition 1.3.3.2. Transpose.

If $A \in \mathbb C^{m \times n}$ and

\begin{equation*} A = \left( \begin{array}{c | c | c | c} \alpha_{0,0} \amp \alpha_{0,1} \amp \cdots \amp \alpha_{0,n-1} \\ \hline \alpha_{1,0} \amp \alpha_{1,1} \amp \cdots \amp \alpha_{1,n-1} \\ \hline \vdots \amp \vdots \amp \amp \vdots \\ \hline \alpha_{m-1,0} \amp \alpha_{m-1,1} \amp \cdots \amp \alpha_{m-1,n-1} \end{array} \right) \end{equation*}

then its transpose is defined by

\begin{equation*} A^T = \left( \begin{array}{c | c | c | c} \alpha_{0,0} \amp \alpha_{1,0} \amp \cdots \amp \alpha_{m-1,0} \\ \hline \alpha_{0, 1} \amp \alpha_{1, 1} \amp \cdots \amp \alpha_{m-1, 1} \\ \hline \vdots \amp \vdots \amp \amp \vdots \\ \hline \alpha_{0, n-1} \amp \alpha_{1, n-1} \amp \cdots \amp \alpha_{m-1, n-1} \end{array} \right) . \end{equation*}

For complex-valued matrices, it is important to also define the Hermitian transpose of a matrix:

###### Definition 1.3.3.3. Hermitian transpose.

If $A \in \mathbb C^{m \times n}$ and

\begin{equation*} A = \left( \begin{array}{c | c | c | c} \alpha_{0,0} \amp \alpha_{0,1} \amp \cdots \amp \alpha_{0,n-1} \\ \hline \alpha_{1,0} \amp \alpha_{1,1} \amp \cdots \amp \alpha_{1,n-1} \\ \hline \vdots \amp \vdots \amp \amp \vdots \\ \hline \alpha_{m-1,0} \amp \alpha_{m-1,1} \amp \cdots \amp \alpha_{m-1,n-1} \end{array} \right) \end{equation*}

then its Hermitian transpose is defined by

\begin{equation*} A^H = \overline A^T = \left( \begin{array}{c | c | c | c} \overline \alpha_{0,0} \amp \overline \alpha_{1,0} \amp \cdots \amp \overline \alpha_{m-1,0} \\ \hline \overline \alpha_{0, 1} \amp \overline \alpha_{1, 1} \amp \cdots \amp \overline \alpha_{m-1, 1} \\ \hline \vdots \amp \vdots \amp \amp \vdots \\ \hline \overline \alpha_{0, n-1} \amp \overline \alpha_{1, n-1} \amp \cdots \amp \overline \alpha_{m-1, n-1} \end{array} \right), \end{equation*}

where $\overline A$ denotes the conjugate of a matrix, in which each element of the matrix is conjugated.

We note that

• $\overline A^T = \overline{ A^T } \text{.}$

• If $A \in \mathbb R^{m \times n} \text{,}$ then $A^H = A^T \text{.}$

• If $x \in \mathbb C^{m} \text{,}$ then $x^H$ is defined consistent with how we have used it before.

• If $\alpha \in \mathbb C \text{,}$ then $\alpha^H = \overline \alpha \text{.}$

(If you view the scalar as a $1 \times 1$ matrix and then Hermitian transpose it, you get the matrix whose only element is $\overline \alpha \text{.}$)
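The observations above translate directly into a few one-line checks (a minimal sketch, assuming NumPy, where the Hermitian transpose is spelled `A.conj().T`):

```python
import numpy as np

A = np.array([[1 + 1j, 2 - 3j, 0 + 2j],
              [4 + 0j, -1 + 1j, 3 - 2j]])

# Conjugate-then-transpose equals transpose-then-conjugate.
assert np.array_equal(A.conj().T, A.T.conj())

# For a real-valued matrix, A^H reduces to A^T.
R = np.array([[1.0, 2.0], [3.0, 4.0]])
assert np.array_equal(R.conj().T, R.T)

# For a scalar, the Hermitian transpose is just the conjugate.
alpha = 2 - 5j
assert np.conj(alpha) == 2 + 5j
```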

Don't Panic! While working with complex-valued scalars, vectors, and matrices may appear a bit scary at first, you will soon notice that it is not really much more complicated than working with their real-valued counterparts.

###### Homework 1.3.3.4.

Let $A \in \mathbb C^{m \times k}$ and $B \in \mathbb C^{k \times n} \text{.}$ Using what you once learned about matrix transposition and matrix-matrix multiplication, reason that $(A B)^H = B^H A^H \text{.}$

Solution
\begin{equation*} \begin{array}{l} ( A B )^H \\ ~~~ = ~~~~ \lt X^H = \overline{X^T} \gt \\ \overline{ (A B)^T } \\ ~~~ = ~~~~ \lt \mbox{ you once discovered that } ( A B )^T = B^T A^T \gt \\ \overline{ B^T A^T } \\ ~~~ = ~~~~ \lt \mbox{ you may check separately that } \overline{ XY} = \overline X \overline Y \gt \\ \overline{ B^T } ~~ \overline{ A^T } \\ ~~~ = ~~~~ \lt \overline {X^T} = \overline X^T \gt \\ B^H A^H \end{array} \end{equation*}
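The identity $(AB)^H = B^H A^H$ can be confirmed on random complex matrices (a minimal sketch, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))

lhs = (A @ B).conj().T          # (A B)^H
rhs = B.conj().T @ A.conj().T   # B^H A^H

# The two sides agree up to floating-point roundoff.
assert np.allclose(lhs, rhs)
```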
###### Definition 1.3.3.4. Hermitian.

A matrix $A \in \mathbb C^{m \times m}$ is Hermitian if and only if $A = A^H \text{.}$

Obviously, if $A \in \mathbb R^{m \times m} \text{,}$ then $A$ is a Hermitian matrix if and only if $A$ is a symmetric matrix.

###### Homework 1.3.3.5.

Let $A \in \mathbb C^{m \times n} \text{.}$

ALWAYS/SOMETIMES/NEVER: $\| A^H \|_F = \| A \|_F\text{.}$
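Before settling on an answer, you can gather evidence numerically: conjugating an entry leaves its absolute value unchanged, and transposing only reorders the terms in the double sum of the definition. A minimal sketch (assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

# |conj(alpha)| = |alpha|, and transposition only reorders the summands,
# so the Frobenius norms of A and A^H agree.
assert np.isclose(np.linalg.norm(A.conj().T, "fro"), np.linalg.norm(A, "fro"))
```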

The Frobenius norm of an $m \times n$ matrix is easy to compute (requiring $O( m n )$ computations). The functions $f( A ) = \| A \|_F$ and $f( A ) = \| A \|_F^2$ are also differentiable. However, you'd be hard-pressed to find a meaningful way of linking the definition of the Frobenius norm to a measure of an underlying linear transformation (other than by first transforming that linear transformation into a matrix).