Blog (mostly math)

Weyl's inequalities

Refs:

  • Main refs. Wiki pages: Min-max theorem, Weyl’s inequality.

  • “Inference and Learning from Data” by Sayed.

  • Qiaochu Yuan’s blogpost: Link.

  • “Matrix Analysis and Applied Linear Algebra” by Meyer.

  • “Analysis and Linear Algebra” by Bisgard.

[Courant-Fischer Theorem]

Let ${ A \in \mathbb{C} ^{n \times n} }$ be a Hermitian matrix.

Note that by the spectral theorem,

\[{ A U = U \text{diag}(\lambda _1, \ldots, \lambda _n) }\]

for some orthonormal matrix ${ U \in \mathbb{C} ^{n \times n} }$ and reals ${ \lambda _1 \geq \ldots \geq \lambda _n . }$

Consider the quadratic form

\[{ Q _A (x) := x ^{\ast} A x . }\]

Note that ${ Q _A (x) \in \mathbb{R} . }$

Q) Can we express the eigenvalues ${ \lambda _1 \geq \ldots \geq \lambda _n }$ of ${ A }$ in terms of the quadratic form ${ Q _A (x) = x ^{\ast} A x }$?

Firstly, the quadratic form scales as

\[{ Q _A (\lambda x) = \vert \lambda \vert ^2 Q _A (x) . }\]

Hence it suffices to study the values of the quadratic form over the unit sphere ${ \lbrace \lVert x \rVert = 1 \rbrace . }$

Further, note that

\[{ {\begin{aligned} &\, Q _A (x) \\ = &\, x ^{\ast} (U \text{diag}(\lambda _1, \ldots, \lambda _n) U ^{\ast}) x \\ = &\, (U ^{\ast} x) ^{\ast} \text{diag}(\lambda _1, \ldots, \lambda _n) (U ^{\ast} x) \\ = &\, \lambda _1 \vert (U ^{\ast} x) _1 \vert ^2 + \ldots + \lambda _n \vert (U ^{\ast} x) _n \vert ^2 . \end{aligned}} }\]

Behaviour of ${ Q _A (x) }$ for ${ x }$ in the section ${ x \in \text{span}(U _1, \ldots, U _j) \cap \lbrace \lVert x \rVert = 1 \rbrace }$:

Any point ${ x \in \text{span}(U _1, \ldots, U _j) \cap \lbrace \lVert x \rVert = 1 \rbrace }$ looks like

\[{ x = U \begin{pmatrix} z _1 \\ \vdots \\ z _j \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \text{where } \vert z _1 \vert ^2 + \ldots + \vert z _j \vert ^2 = 1 . }\]

For such ${ x }$ we have

\[{ {\begin{aligned} &\, Q _A (x) \\ = &\, \lambda _1 \vert z _1 \vert ^2 + \ldots + \lambda _j \vert z _j \vert ^2 \\ \geq &\, \lambda _j (\vert z _1 \vert ^2 + \ldots + \vert z _j \vert ^2) \\ = &\, \lambda _j \end{aligned}} }\]

with equality occurring at ${ x = U _j . }$

Hence

\[{ \min _{x \in \text{span}(U _1, \ldots, U _j) \cap \lbrace \lVert x \rVert = 1 \rbrace} Q _A (x) = \lambda _j . }\]

We have sort of extracted ${ \lambda _j }$ from ${ Q _A (x) }$… Except we want to remove the dependence on ${ \text{span}(U _1, \ldots, U _j) }$…

This suggests studying the function on subspaces

\[{ {\begin{aligned} &\, \psi _j (\mathcal{V}) := \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace} Q _A (x), \\ &\, \text{for subspaces with } \dim(\mathcal{V}) = j . \end{aligned}} }\]

Note that the function ${ \psi _j (\mathcal{V}) }$ is well-defined by compactness of ${ \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace . }$

We saw that

\[{ \psi _j (\text{span}(U _1, \ldots, U _j)) = \lambda _j . }\]

What is the behaviour of ${ \psi _j (\mathcal{V}) }$ for other inputs ${ \mathcal{V} }$ with ${ \dim (\mathcal{V}) = j }$?

Let ${ \mathcal{V} }$ be a subspace of dimension ${ \dim(\mathcal{V}) = j. }$

We will show that

\[{ \text{Will show:} \quad \psi _j (\mathcal{V}) \leq \lambda _j }\]

that is

\[{ \text{Will show:} \quad \exists x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace \text{ with } Q _A (x) \leq \lambda _j . }\]

We will look for such a vector…

Consider the intersection

\[{ \text{span}(U _j, \ldots, U _n) \cap \mathcal{V} . }\]

Note that the intersection is non-empty

\[{ \text{span}(U _j, \ldots, U _n) \cap \mathcal{V} \neq \emptyset }\]

because otherwise

\[{ {\begin{aligned} &\, \dim(\text{span}(U _j, \ldots, U _n) + \mathcal{V}) \\ = &\, \dim(\text{span}(U _j, \ldots, U _n)) + \dim(\mathcal{V}) \\ = &\, (n - j + 1) + j \\ = &\, (n + 1) > n, \end{aligned}} }\]

a contradiction.

In a more geometric language, the subspaces ${ \text{span}(U _j, \ldots, U _n) }$ and ${ \mathcal{V} }$ intersect on atleast one line.

Hence pick

\[{ x \in \text{span}(U _j, \ldots, U _n) \cap \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace . }\]

For this ${ x , }$ it looks like

\[{ x = U \begin{pmatrix} 0 \\ \vdots \\ 0 \\ z _j \\ \vdots \\ z _n \end{pmatrix} \text{ where } \vert z _j \vert ^2 + \ldots + \vert z _n \vert ^2 = 1; \, \, x \in \mathcal{V} . }\]

For this ${ x }$ we have

\[{ {\begin{aligned} &\, Q _A (x) \\ = &\, \lambda _j \vert z _j \vert ^2 + \ldots + \lambda _n \vert z _n \vert ^2 \\ \leq &\, \lambda _j (\vert z _j \vert ^2 + \ldots + \vert z _n \vert ^2) \\ = &\, \lambda _j \end{aligned}} }\]

as needed.

Hence

\[{ {\begin{aligned} &\, \psi _j (\text{span}(U _1, \ldots, U _j)) = \lambda _j, \\ &\, \psi _j (\mathcal{V}) \leq \lambda _j \text{ for any subspace with } \dim(\mathcal{V}) = j . \end{aligned}} }\]

Hence

\[{ \max _{\dim(\mathcal{V}) = j} \psi _j (\mathcal{V}) = \lambda _j . }\]

Hence

\[{ \max _{\dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } Q _A (x) \right) = \lambda _j . }\]

To summarize:

Thm [Courant-Fischer, max-min]

Let ${ A \in \mathbb{C} ^{n \times n} }$ be a Hermitian matrix with eigenvalues ${ \lambda _1 \geq \ldots \geq \lambda _n .}$

Then the eigenvalues can be characterised as

\[{ \boxed{ \lambda _j = \max _{\dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} A x \right) }. }\]

Thm [Courant-Fischer, min-max]

Let ${ A \in \mathbb{C} ^{n \times n} }$ be a Hermitian matrix with eigenvalues ${ \lambda _1 \geq \ldots \geq \lambda _n .}$

Then the eigenvalues can be characterised as

\[{ \boxed{\lambda _j = \min _{\dim(\mathcal{V}) = n - j + 1} \left( \max _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} A x \right) }. }\]

Pf: Note that ${ (-A) }$ is a Hermitian matrix with eigenvalues ${ (- \lambda _n ) \geq \ldots \geq (- \lambda _1). }$

Applying Courant-Fischer max-min for ${ (-A) , }$ we have

\[{ {\begin{aligned} &\, (-\lambda _j) \\ = &\, \max _{\dim(\mathcal{V}) = n - j + 1} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} (- A) x \right) \\ = &\, \max _{\dim(\mathcal{V}) = n - j + 1} \left( - \max _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} A x \right) \\ = &\, - \min _{\dim(\mathcal{V}) = n - j + 1} \left( \max _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} A x \right) \end{aligned}} }\]

as needed. ${ \blacksquare }$

[Weyl’s inequalities]

Let ${ A, B \in \mathbb{C} ^{n \times n} }$ be Hermitian matrices, with eigenvalues ${ \lambda _1 (A) \geq \ldots \geq \lambda _n (A) }$ and ${ \lambda _1 (B) \geq \ldots \geq \lambda _n (B) }$ respectively.

Note that ${ A + B }$ is a Hermitian matrix, with eigenvalues ${ \lambda _1 (A + B) \geq \ldots \geq \lambda _n (A + B). }$

Q) Can we relate the eigenvalues of ${ A + B }$ to the eigenvalues of ${ A, B }$?

We will first try to upper bound eigenvalues of ${ A + B }$ using eigenvalues of ${ A , B . }$

Note that by the max-min theorem,

\[{ {\begin{aligned} &\, \lambda _{\alpha} (A + B) \leq L \\ \iff &\, \max _{\dim(\mathcal{V}) = \alpha} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} (A + B) x \right) \leq L \\ \iff &\, \text{For all } \mathcal{V} \text{ with } \dim(\mathcal{V}) = \alpha, \\ &\, \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} (A + B) x \leq L \\ \iff &\, \text{For all } \mathcal{V} \text{ with } \dim(\mathcal{V}) = \alpha, \\ &\, \text{There exists } x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace \\ &\, \text{With } x ^{\ast} (A + B) x \leq L . \end{aligned}} }\]

Consider two eigenvalues ${ \lambda _i (A) , \lambda _j (B) . }$

Note that there exists a subspace ${ W _{A} }$ such that

\[{ {\begin{aligned} &\, \dim(W _A) = (n - i + 1) , \\ &\, \max _{x \in W _A \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} A x = \lambda _i (A) . \end{aligned}} }\]

Note that there exists a subspace ${ W _B }$ such that

\[{ {\begin{aligned} &\, \dim(W _B) = (n - j + 1) , \\ &\, \max _{x \in W _B \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{\ast} B x = \lambda _j (A) . \end{aligned}} }\]

Note that

\[{ {\begin{aligned} \dim(W _A \cap W _B) \geq &\, \dim(W _A) + \dim(W _B) - n \\ = &\, n - i - j + 2 . \end{aligned}} }\]

We want this dimension to be ${ > 0 , }$ that is

\[{ \text{Want:} \quad n - i - j + 2 \geq 1 }\]

that is

\[{ \text{Want:} \quad i + j \leq n + 1 . }\]

Suppose we have

\[{ \boxed{ \text{Suppose:} \quad i + j \leq n + 1 } . }\]

Hence the subspaces ${ W _A , W _B }$ intersect on atleast one line.

Note that

\[{ {\begin{aligned} &\, \text{For } x \in W _A \cap W _B \cap \lbrace \lVert x \rVert = 1 \rbrace , \\ &\, x ^{\ast} (A + B) x \leq \lambda _i (A) + \lambda _j (B) . \end{aligned}} }\]

We are getting close to connect this to the “${ \lambda _{\alpha} (A + B) \leq L }$ criterion”…

We will choose an ${ \alpha }$ such that ${ \lambda _{\alpha} (A + B) \leq \lambda _i (A) + \lambda _j (B) }$ works…

Note that with this setup

\[{ {\begin{aligned} &\, \lambda _{\alpha} (A + B) \leq \lambda _i (A) + \lambda _j (B) \\ \iff &\, \text{For all } \mathcal{V} \text{ with } \dim(\mathcal{V}) = \alpha, \\ &\, \text{There exists } x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace \\ &\, \text{With } x ^{\ast} (A + B) x \leq \lambda _i (A) + \lambda _j (B) \\ \impliedby &\, \text{For all } \mathcal{V} \text{ with } \dim(\mathcal{V}) = \alpha, \\ &\, \mathcal{V} \cap W _A \cap W _B \neq \emptyset . \end{aligned}} }\]

Let ${ \mathcal{V} }$ be a subspace with ${ \dim(\mathcal{V}) = \alpha . }$

Note that

\[{ {\begin{aligned} \dim(\mathcal{V} \cap W _A \cap W _B) \geq &\, \dim(\mathcal{V}) + \dim(W _A \cap W _B) - n \\ \geq &\, \alpha + (n - i - j + 2) - n \\ = &\, (\alpha - i - j + 2) . \end{aligned}} }\]

We want this dimension to be ${ > 0 , }$ that is

\[{ \text{Want:} \quad (\alpha - i - j + 2) \geq 1 }\]

that is

\[{ \text{Want:} \quad \alpha \geq i + j - 1 . }\]

Suppose we have

\[{ \boxed{ \text{Suppose:} \quad \alpha = i + j - 1 }. }\]

Note that we have

\[{ {\begin{aligned} &\, \text{For all } \mathcal{V} \text{ with } \dim(\mathcal{V}) = i + j - 1, \\ &\, \mathcal{V} \cap W _A \cap W _B \neq \emptyset \end{aligned}} }\]

Hence

\[{ \lambda _{i + j - 1} (A + B) \leq \lambda _i (A) + \lambda _j (B) . }\]

To summarize:

Thm [Weyl’s inequalities, Upper Bounds]

Let ${ A, B \in \mathbb{C} ^{n \times n} }$ be Hermitian matrices.

If ${ i + j \leq n + 1 , }$ then

\[{ \boxed{ \lambda _{i + j - 1} (A + B) \leq \lambda _i (A) + \lambda _j (B) } . }\]

Thm [Weyl’s inequalities, Lower Bounds]

Let ${ A, B \in \mathbb{C} ^{n \times n} }$ be Hermitian matrices.

If ${ i + j > n , }$ then

\[{ \boxed{\lambda _i (A) + \lambda _j (B) \leq \lambda _{i + j - n} (A + B) } . }\]

Pf: Note that if ${ (n - i +1) + (n - j +1) \leq n + 1, }$ then

\[{ \lambda _{2n - i - j + 1} (A + B) \leq \lambda _{n - i + 1} (A) + \lambda _{n - j + 1} (B) }\]

that is

\[{ (- \lambda _{n - i + 1} (A)) + (- \lambda _{n - j + 1} (B)) \leq (- \lambda _{2n - i - j + 1} (A + B)) }\]

that is

\[{ \lambda _i (-A) + \lambda _j (-B) \leq \lambda _{i + j - n} (-(A + B)) . }\]

Replacing ${ A, B }$ by ${ (-A), (-B) }$ we get the result. ${ \blacksquare }$

[Courant-Fischer for singular values]

Thm: Let ${ A \in \mathbb{R} ^{m \times n} . }$ Let ${ p = \text{min} \lbrace m, n \rbrace . }$

Then for ${ j = 1, \ldots, p , }$

\[{ \boxed{ \sigma _j = \max _{\mathcal{V} \subseteq \mathbb{R} ^{n}, \dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace} \lVert A x \rVert \right) } }\]

and

\[{ \boxed{ \sigma _j = \min _{\mathcal{V} \subseteq \mathbb{R} ^{n}, \dim(\mathcal{V}) = n - j + 1 } \left( \max _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace} \lVert Ax \rVert \right) } . }\]

Pf: We will show the first part (using Courant-Fischer max-min). The second part follows similarly (using Courant-Fischer min-max).

Note that

\[{ {\begin{aligned} &\, \sigma _j \\ = &\, (\lambda _j (A ^{T} A)) ^{1/2} \\ = &\, \left( \max _{\dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } x ^{T} A ^{T} A x \right) \right) ^{1/2} \\ = &\, \left( \max _{\dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } \lVert A x \rVert ^2 \right) \right) ^{1/2} \\ = &\, \left( \max _{\dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } \lVert A x \rVert \right) ^2 \right) ^{1/2} \\ = &\, \max _{\dim(\mathcal{V}) = j} \left( \min _{x \in \mathcal{V} \cap \lbrace \lVert x \rVert = 1 \rbrace } \lVert A x \rVert \right) \end{aligned}} }\]

as needed. ${ \blacksquare }$

[Weyl’s inequalities for singular values]

Thm: Let ${ A, B \in \mathbb{R} ^{m \times n} . }$ Let ${ p = \min \lbrace m, n \rbrace . }$

If ${ i + j \leq p + 1, }$ and ${ i \leq p , }$ and ${ j \leq p , }$ we have

\[{ \boxed{ \sigma _{i + j - 1} (A + B) \leq \sigma _i (A) + \sigma _j (B) } . }\]

Pf: Effectively repeat the proof for the original Weyl’s inequalities. (Use Courant-Fischer for singular values in the process). ${ \blacksquare }$

Cor [Lipschitz variation of singular values]

Let ${ A, B \in \mathbb{R} ^{m \times n} . }$ Let ${ p = \min \lbrace m, n \rbrace . }$

Then for ${ j = 1, \ldots, p , }$

\[{ \vert \sigma _j (A) - \sigma _j (B) \vert \leq \sigma _1 (A - B) = \lVert A - B \rVert _{\text{op}} . }\]

Pf: Note that

\[{ \sigma _{1 + j - 1} (B - A + A) \leq \sigma _1 (B - A) + \sigma _j (A) . }\]

That is,

\[{ \sigma _j (B) - \sigma _j (A) \leq \sigma _1 (B - A) = \lVert B - A \rVert _{\text{op}} . }\]

Interchanging the roles of ${ A, B, }$

\[{ \sigma _j (A) - \sigma _j (B) \leq \sigma _1 (A - B) = \lVert B - A \rVert _{\text{op}} . }\]

Hence

\[{ \vert \sigma _j (A) - \sigma _j (B) \vert \leq \sigma _1 (A - B) = \lVert A - B \rVert _{\text{op}} }\]

as needed. ${ \blacksquare }$

comments powered by Disqus