Refs:
-
Main refs. Wiki pages: Min-max theorem, Weyl’s inequality.
-
“Inference and Learning from Data” by Sayed.
-
Qiaochu Yuan’s blogpost: Link.
-
“Matrix Analysis and Applied Linear Algebra” by Meyer.
-
“Analysis and Linear Algebra” by Bisgard.
[Courant-Fischer Theorem]
Let ${ A \in \mathbb{C} ^{n \times n} }$ be a Hermitian matrix.
Note that by the spectral theorem,
\[{ A U = U \text{diag}(\lambda _1, \ldots, \lambda _n) }\]for some orthonormal matrix ${ U \in \mathbb{C} ^{n \times n} }$ and reals ${ \lambda _1 \geq \ldots \geq \lambda _n . }$
Consider the quadratic form
\[{ Q _A (x) := x ^{\ast} A x . }\]Note that ${ Q _A (x) \in \mathbb{R} . }$
Q) Can we express the eigenvalues ${ \lambda _1 \geq \ldots \geq \lambda _n }$ of ${ A }$ in terms of the quadratic form ${ Q _A (x) = x ^{\ast} A x }$?
Firstly, the quadratic form scales as
\[{ Q _A (\lambda x) = \vert \lambda \vert ^2 Q _A (x) . }\]Hence it suffices to study the values of the quadratic form over the unit sphere ${ \lbrace \lVert x \rVert = 1 \rbrace . }$
Further, note that
\[{ {\begin{aligned} &\, Q _A (x) \\ = &\, x ^{\ast} (U \text{diag}(\lambda _1, \ldots, \lambda _n) U ^{\ast}) x \\ = &\, (U ^{\ast} x) ^{\ast} \text{diag}(\lambda _1, \ldots, \lambda _n) (U ^{\ast} x) \\ = &\, \lambda _1 \vert (U ^{\ast} x) _1 \vert ^2 + \ldots + \lambda _n \vert (U ^{\ast} x) _n \vert ^2 . \end{aligned}} }\]Behaviour of ${ Q _A (x) }$ for ${ x }$ in the section ${ x \in \text{span}(U _1, \ldots, U _j) \cap \lbrace \lVert x \rVert = 1 \rbrace }$:
Any point ${ x \in \text{span}(U _1, \ldots, U _j) \cap \lbrace \lVert x \rVert = 1 \rbrace }$ looks like
\[{ x = U \begin{pmatrix} z _1 \\ \vdots \\ z _j \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \text{where } \vert z _1 \vert ^2 + \ldots + \vert z _j \vert ^2 = 1 . }\]For such ${ x }$ we have
\[{ {\begin{aligned} &\, Q _A (x) \\ = &\, \lambda _1 \vert z _1 \vert ^2 + \ldots + \lambda _j \vert z _j \vert ^2 \\ \geq &\, \lambda _j (\vert z _1 \vert ^2 + \ldots + \vert z _j \vert ^2) \\ = &\, \lambda _j \end{aligned}} }\]with equality occurring at ${ x = U _j . }$
Hence
\[{ \min _{x \in \text{span}(U _1, \ldots, U _j) \cap \lbrace \lVert x \rVert = 1 \rbrace} Q _A (x) = \lambda _j . }\]We have sort of extracted ${ \lambda _j }$ from ${ Q _A (x) }$… Except we want to remove the dependence on ${ \text{span}(U _1, \ldots, U _j) }$…
This suggests studying the function on subspaces
\[{ {\begin{aligned} &\, \psi _j (\mathcal{V}) := \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace} Q _A (x), \\ &\, \text{for subspaces with } \dim(\mathcal{V}) = j . \end{aligned}} }\]Note that the function ${ \psi _j (\mathcal{V}) }$ is well-defined by compactness of ${ \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace . }$
We saw that
\[{ \psi _j (\text{span}(U _1, \ldots, U _j)) = \lambda _j . }\]What is the behaviour of ${ \psi _j (\mathcal{V}) }$ for other inputs ${ \mathcal{V} }$ with ${ \dim (\mathcal{V}) = j }$?
Let ${ \mathcal{V} }$ be a subspace of dimension ${ \dim(\mathcal{V}) = j. }$
We will show that
\[{ \text{Will show:} \quad \psi _j (\mathcal{V}) \leq \lambda _j }\]that is
\[{ \text{Will show:} \quad \exists x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace \text{ with } Q _A (x) \leq \lambda _j . }\]We will look for such a vector…
Consider the intersection
\[{ \text{span}(U _j, \ldots, U _n) \cap \mathcal{V} . }\]Note that the intersection is non-empty
\[{ \text{span}(U _j, \ldots, U _n) \cap \mathcal{V} \neq \emptyset }\]because otherwise
\[{ {\begin{aligned} &\, \dim(\text{span}(U _j, \ldots, U _n) + \mathcal{V}) \\ = &\, \dim(\text{span}(U _j, \ldots, U _n)) + \dim(\mathcal{V}) \\ = &\, (n - j + 1) + j \\ = &\, (n + 1) > n, \end{aligned}} }\]a contradiction.
In a more geometric language, the subspaces ${ \text{span}(U _j, \ldots, U _n) }$ and ${ \mathcal{V} }$ intersect on atleast one line.
Hence pick
\[{ x \in \text{span}(U _j, \ldots, U _n) \cap \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace . }\]For this ${ x , }$ it looks like
\[{ x = U \begin{pmatrix} 0 \\ \vdots \\ 0 \\ z _j \\ \vdots \\ z _n \end{pmatrix} \text{ where } \vert z _j \vert ^2 + \ldots + \vert z _n \vert ^2 = 1; \, \, x \in \mathcal{V} . }\]For this ${ x }$ we have
\[{ {\begin{aligned} &\, Q _A (x) \\ = &\, \lambda _j \vert z _j \vert ^2 + \ldots + \lambda _n \vert z _n \vert ^2 \\ \leq &\, \lambda _j (\vert z _j \vert ^2 + \ldots + \vert z _n \vert ^2) \\ = &\, \lambda _j \end{aligned}} }\]as needed.
Hence
\[{ {\begin{aligned} &\, \psi _j (\text{span}(U _1, \ldots, U _j)) = \lambda _j, \\ &\, \psi _j (\mathcal{V}) \leq \lambda _j \text{ for any subspace with } \dim(\mathcal{V}) = j . \end{aligned}} }\]Hence
\[{ \max _{\dim(\mathcal{V}) = j} \psi _j (\mathcal{V}) = \lambda _j . }\]Hence
\[{ \max _{\dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } Q _A (x) \right) = \lambda _j . }\]To summarize:
Thm [Courant-Fischer, max-min]
Let ${ A \in \mathbb{C} ^{n \times n} }$ be a Hermitian matrix with eigenvalues ${ \lambda _1 \geq \ldots \geq \lambda _n .}$
Then the eigenvalues can be characterised as
\[{ \boxed{ \lambda _j = \max _{\dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} A x \right) }. }\]Thm [Courant-Fischer, min-max]
Let ${ A \in \mathbb{C} ^{n \times n} }$ be a Hermitian matrix with eigenvalues ${ \lambda _1 \geq \ldots \geq \lambda _n .}$
Then the eigenvalues can be characterised as
\[{ \boxed{\lambda _j = \min _{\dim(\mathcal{V}) = n - j + 1} \left( \max _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} A x \right) }. }\]Pf: Note that ${ (-A) }$ is a Hermitian matrix with eigenvalues ${ (- \lambda _n ) \geq \ldots \geq (- \lambda _1). }$
Applying Courant-Fischer max-min for ${ (-A) , }$ we have
\[{ {\begin{aligned} &\, (-\lambda _j) \\ = &\, \max _{\dim(\mathcal{V}) = n - j + 1} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} (- A) x \right) \\ = &\, \max _{\dim(\mathcal{V}) = n - j + 1} \left( - \max _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} A x \right) \\ = &\, - \min _{\dim(\mathcal{V}) = n - j + 1} \left( \max _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} A x \right) \end{aligned}} }\]as needed. ${ \blacksquare }$
[Weyl’s inequalities]
Let ${ A, B \in \mathbb{C} ^{n \times n} }$ be Hermitian matrices, with eigenvalues ${ \lambda _1 (A) \geq \ldots \geq \lambda _n (A) }$ and ${ \lambda _1 (B) \geq \ldots \geq \lambda _n (B) }$ respectively.
Note that ${ A + B }$ is a Hermitian matrix, with eigenvalues ${ \lambda _1 (A + B) \geq \ldots \geq \lambda _n (A + B). }$
Q) Can we relate the eigenvalues of ${ A + B }$ to the eigenvalues of ${ A, B }$?
We will first try to upper bound eigenvalues of ${ A + B }$ using eigenvalues of ${ A , B . }$
Note that by the max-min theorem,
\[{ {\begin{aligned} &\, \lambda _{\alpha} (A + B) \leq L \\ \iff &\, \max _{\dim(\mathcal{V}) = \alpha} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} (A + B) x \right) \leq L \\ \iff &\, \text{For all } \mathcal{V} \text{ with } \dim(\mathcal{V}) = \alpha, \\ &\, \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} (A + B) x \leq L \\ \iff &\, \text{For all } \mathcal{V} \text{ with } \dim(\mathcal{V}) = \alpha, \\ &\, \text{There exists } x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace \\ &\, \text{With } x ^{\ast} (A + B) x \leq L . \end{aligned}} }\]Consider two eigenvalues ${ \lambda _i (A) , \lambda _j (B) . }$
Note that there exists a subspace ${ W _{A} }$ such that
\[{ {\begin{aligned} &\, \dim(W _A) = (n - i + 1) , \\ &\, \max _{x \in W _A \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} A x = \lambda _i (A) . \end{aligned}} }\]Note that there exists a subspace ${ W _B }$ such that
\[{ {\begin{aligned} &\, \dim(W _B) = (n - j + 1) , \\ &\, \max _{x \in W _B \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} B x = \lambda _j (A) . \end{aligned}} }\]Note that
\[{ {\begin{aligned} \dim(W _A \cap W _B) \geq &\, \dim(W _A) + \dim(W _B) - n \\ = &\, n - i - j + 2 . \end{aligned}} }\]We want this dimension to be ${ > 0 , }$ that is
\[{ \text{Want:} \quad n - i - j + 2 \geq 1 }\]that is
\[{ \text{Want:} \quad i + j \leq n + 1 . }\]Suppose we have
\[{ \boxed{ \text{Suppose:} \quad i + j \leq n + 1 } . }\]Hence the subspaces ${ W _A , W _B }$ intersect on atleast one line.
Note that
\[{ {\begin{aligned} &\, \text{For } x \in W _A \cap W _B \cap \lbrace \lVert x \rVert = 1 \rbrace , \\ &\, x ^{\ast} (A + B) x \leq \lambda _i (A) + \lambda _j (B) . \end{aligned}} }\]We are getting close to connect this to the “${ \lambda _{\alpha} (A + B) \leq L }$ criterion”…
We will choose an ${ \alpha }$ such that ${ \lambda _{\alpha} (A + B) \leq \lambda _i (A) + \lambda _j (B) }$ works…
Note that with this setup
\[{ {\begin{aligned} &\, \lambda _{\alpha} (A + B) \leq \lambda _i (A) + \lambda _j (B) \\ \iff &\, \text{For all } \mathcal{V} \text{ with } \dim(\mathcal{V}) = \alpha, \\ &\, \text{There exists } x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace \\ &\, \text{With } x ^{\ast} (A + B) x \leq \lambda _i (A) + \lambda _j (B) \\ \impliedby &\, \text{For all } \mathcal{V} \text{ with } \dim(\mathcal{V}) = \alpha, \\ &\, \mathcal{V} \cap W _A \cap W _B \neq \emptyset . \end{aligned}} }\]Let ${ \mathcal{V} }$ be a subspace with ${ \dim(\mathcal{V}) = \alpha . }$
Note that
\[{ {\begin{aligned} \dim(\mathcal{V} \cap W _A \cap W _B) \geq &\, \dim(\mathcal{V}) + \dim(W _A \cap W _B) - n \\ \geq &\, \alpha + (n - i - j + 2) - n \\ = &\, (\alpha - i - j + 2) . \end{aligned}} }\]We want this dimension to be ${ > 0 , }$ that is
\[{ \text{Want:} \quad (\alpha - i - j + 2) \geq 1 }\]that is
\[{ \text{Want:} \quad \alpha \geq i + j - 1 . }\]Suppose we have
\[{ \boxed{ \text{Suppose:} \quad \alpha = i + j - 1 }. }\]Note that we have
\[{ {\begin{aligned} &\, \text{For all } \mathcal{V} \text{ with } \dim(\mathcal{V}) = i + j - 1, \\ &\, \mathcal{V} \cap W _A \cap W _B \neq \emptyset \end{aligned}} }\]Hence
\[{ \lambda _{i + j - 1} (A + B) \leq \lambda _i (A) + \lambda _j (B) . }\]To summarize:
Thm [Weyl’s inequalities, Upper Bounds]
Let ${ A, B \in \mathbb{C} ^{n \times n} }$ be Hermitian matrices.
If ${ i + j \leq n + 1 , }$ then
\[{ \boxed{ \lambda _{i + j - 1} (A + B) \leq \lambda _i (A) + \lambda _j (B) } . }\]Thm [Weyl’s inequalities, Lower Bounds]
Let ${ A, B \in \mathbb{C} ^{n \times n} }$ be Hermitian matrices.
If ${ i + j > n , }$ then
\[{ \boxed{\lambda _i (A) + \lambda _j (B) \leq \lambda _{i + j - n} (A + B) } . }\]Pf: Note that if ${ (n - i +1) + (n - j +1) \leq n + 1, }$ then
\[{ \lambda _{2n - i - j + 1} (A + B) \leq \lambda _{n - i + 1} (A) + \lambda _{n - j + 1} (B) }\]that is
\[{ (- \lambda _{n - i + 1} (A)) + (- \lambda _{n - j + 1} (B)) \leq (- \lambda _{2n - i - j + 1} (A + B)) }\]that is
\[{ \lambda _i (-A) + \lambda _j (-B) \leq \lambda _{i + j - n} (-(A + B)) . }\]Replacing ${ A, B }$ by ${ (-A), (-B) }$ we get the result. ${ \blacksquare }$
[Courant-Fischer for singular values]
Thm: Let ${ A \in \mathbb{R} ^{m \times n} . }$ Let ${ p = \text{min} \lbrace m, n \rbrace . }$
Then for ${ j = 1, \ldots, p , }$
\[{ \boxed{ \sigma _j = \max _{\mathcal{V} \subseteq \mathbb{R} ^{n}, \dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace} \lVert A x \rVert \right) } }\]and
\[{ \boxed{ \sigma _j = \min _{\mathcal{V} \subseteq \mathbb{R} ^{n}, \dim(\mathcal{V}) = n - j + 1 } \left( \max _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace} \lVert Ax \rVert \right) } . }\]Pf: We will show the first part (using Courant-Fischer max-min). The second part follows similarly (using Courant-Fischer min-max).
Note that
\[{ {\begin{aligned} &\, \sigma _j \\ = &\, (\lambda _j (A ^{T} A)) ^{1/2} \\ = &\, \left( \max _{\dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{T} A ^{T} A x \right) \right) ^{1/2} \\ = &\, \left( \max _{\dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } \lVert A x \rVert ^2 \right) \right) ^{1/2} \\ = &\, \left( \max _{\dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } \lVert A x \rVert \right) ^2 \right) ^{1/2} \\ = &\, \max _{\dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } \lVert A x \rVert \right) \end{aligned}} }\]as needed. ${ \blacksquare }$
[Weyl’s inequalities for singular values]
Thm: Let ${ A, B \in \mathbb{R} ^{m \times n} . }$ Let ${ p = \min \lbrace m, n \rbrace . }$
If ${ i + j \leq p + 1, }$ and ${ i \leq p , }$ and ${ j \leq p , }$ we have
\[{ \boxed{ \sigma _{i + j - 1} (A + B) \leq \sigma _i (A) + \sigma _j (B) } . }\]Pf: Effectively repeat the proof for the original Weyl’s inequalities. (Use Courant-Fischer for singular values in the process). ${ \blacksquare }$
Cor [Lipschitz variation of singular values]
Let ${ A, B \in \mathbb{R} ^{m \times n} . }$ Let ${ p = \min \lbrace m, n \rbrace . }$
Then for ${ j = 1, \ldots, p , }$
\[{ \vert \sigma _j (A) - \sigma _j (B) \vert \leq \sigma _1 (A - B) = \lVert A - B \rVert _{\text{op}} . }\]Pf: Note that
\[{ \sigma _{1 + j - 1} (B - A + A) \leq \sigma _1 (B - A) + \sigma _j (A) . }\]That is,
\[{ \sigma _j (B) - \sigma _j (A) \leq \sigma _1 (B - A) = \lVert B - A \rVert _{\text{op}} . }\]Interchanging the roles of ${ A, B, }$
\[{ \sigma _j (A) - \sigma _j (B) \leq \sigma _1 (A - B) = \lVert B - A \rVert _{\text{op}} . }\]Hence
\[{ \vert \sigma _j (A) - \sigma _j (B) \vert \leq \sigma _1 (A - B) = \lVert A - B \rVert _{\text{op}} }\]as needed. ${ \blacksquare }$