Blog (mostly math)

Differentiation-3

Link to previous part

ROUGH NOTES (!)
Updated: 12/7/24

Second derivative; Third derivative; Higher derivatives; Taylor’s theorem; Inverse function theorem; Implicit function theorem

Back to top.
\[{ \underline{\textbf{Second derivative}} }\]

Def [${ C ^2 }$ maps]: Let ${ E, F }$ be complete normed spaces, and ${ f : U (\subseteq E \text{ open}) \to F .}$
We say ${ f }$ is ${ C ^2 }$ if it is ${ C ^1 }$ and the derivative ${ Df : U \to L(E, F) }$ is ${ C ^1 , }$ that is if the derivatives

\[{ Df : U \longrightarrow L(E, F), }\] \[{ D(Df) = D^2 f : U \longrightarrow L(E, L(E, F)) }\]

exist and are continuous.

The codomain ${ L (E, L(E, F)) }$ above is in fact the space of continuous bilinear maps ${ L ^2 (E; F) .}$

Obs [${ L(E, L(E, F)) \cong L ^2 (E; F) }$ as normed spaces]:
Let ${ E, F }$ be complete normed spaces. Recall ${ L ^k (E; F) }$ is the space of continuous multilinear maps ${ \underbrace{E \times \ldots \times E} _{k \text{ many}} \to F . }$ Now

\[{ \Phi : L(E, L(E, F)) \longrightarrow L ^{2} (E; F), }\] \[{ \lambda \longmapsto ( (x _1, x _2) \mapsto \lambda (x _1) (x _2) ) }\]

is an isomorphism of normed spaces.

Here ${ \lambda }$ in ${ L(E, L(E, F)) }$ takes inputs as ${ \lambda (\underline{\,}) (\underline{\,}) ,}$ and ${ \Phi (\lambda) }$ in ${ L ^2 (E; F) }$ takes inputs as ${ \Phi (\lambda) (\underline{\,}, \underline{\,}) .}$
Writing (informally) ${ \Phi (\lambda) }$ as ${ \lambda (\underline{\,} , \underline{\,}) ,}$ the normed space isomorphism is

\[{ \begin{align*} &\, L(E, L(E, F)) \overset{\cong}{\longleftrightarrow} L ^2 (E; F) , \\ &\, \quad \lambda (\_) (\_) \quad \longleftrightarrow \quad \lambda (\_ \, , \, \_) . \end{align*} }\]

Pf: Firstly, if ${ \lambda \in L(E, L(E, F)) }$ then

\[{ \Phi(\lambda) : (x _1, x _2) \mapsto \lambda (x _1) (x _2) }\]

is in ${ L ^2 (E; F) }$: Bilinearity of ${ \Phi (\lambda) }$ is clear, and continuity of ${ \Phi(\lambda) }$ is because

\[{ \begin{align*} \lVert \lambda (x _1) (x _2) \rVert \leq &\, \lVert \lambda (x _1) \rVert \lVert x _2 \rVert \quad (\text{as } \lambda(x _1) \in L(E, F)) \\ \leq &\, \lVert \lambda \rVert \lVert x _1 \rVert \lVert x _2 \rVert \quad (\text{as } \lambda \in L(E, L(E, F))). \end{align*} }\]

So ${ \Phi }$ is well-defined.

Linearity of ${ \Phi }$ is clear.

The above inequality gives

\[{ \lVert \Phi (\lambda) \rVert \leq \lVert \lambda \rVert , }\]

but even

\[{ \begin{align*} \lVert \lambda \rVert = &\, \sup _{x _1 \neq 0} \frac{\lVert \lambda (x _1) \rVert _{L(E, F)} }{\lVert x _1 \rVert} \\ = &\, \sup _{x _1\neq 0} \left( \sup _{x _2 \neq 0} \frac{\lVert \lambda(x _1)(x _2)\rVert}{\lVert x _1 \rVert \lVert x _2 \rVert} \right) \\ \leq &\, \sup _{x _1 \neq 0} \lVert \Phi (\lambda) \rVert = \lVert \Phi (\lambda) \rVert , \end{align*} }\]

so

\[{ \lVert \Phi (\lambda) \rVert = \lVert \lambda \rVert }\]

that is ${ \Phi }$ preserves norms.

It is left to show that ${ \Phi }$ is a bijection.

Surjectivity of ${ \Phi }$: Say ${ f \in L ^2 (E; F) . }$ We want an ${ f _0 \in L(E, L(E, F)) }$ such that ${ \Phi (f _0) = f .}$
Informally, defining ${ f _0 := f(\underline{\,}) (\underline{\,}) }$ works. Formally, consider

\[{ f _0 : E \longrightarrow L(E, F) , }\] \[{ x _1 \longmapsto (x _2 \mapsto f(x _1, x _2)) . }\]

Each ${ f _0 (x _1 ) }$ is in ${ L(E, F) , }$ because linearity of ${ f _0 (x _1) : E \to F }$ is clear and

\[{ \sup _{x _2\neq 0} \frac{\lVert f _0 (x _1) (x _2) \rVert}{\lVert x _2 \rVert} = \sup _{x _2 \neq 0} \frac{\lVert f(x _1, x _2) \rVert}{\lVert x _2 \rVert} \leq \lVert f \rVert \lVert x _1 \rVert < \infty . }\]

Now ${ f _0 \in L(E, L(E, F)) , }$ because linearity of ${ f _0 }$ is clear and

\[{ \begin{align*} \sup _{x _1 \neq 0} \frac{\lVert f _0 (x _1) \rVert _{L(E, F)}}{\lVert x _1 \rVert} = &\, \sup _{x _1 \neq 0} \left( \sup _{x _2 \neq 0} \frac{\lVert f _0 (x _1) (x _2) \rVert }{\lVert x _1 \rVert \lVert x _2 \rVert} \right) \\ \leq &\sup _{x _1 \neq 0} \lVert f \rVert = \lVert f \rVert < \infty . \end{align*} }\]

Finally ${ \Phi(f _0) = f }$ because

\[{ \begin{align*} \Phi(f _0) \, (x _1, x _2) = &\, f _0 (x _1) (x _2) \\ = &\, f \, (x _1, x _2) \end{align*} }\]

for all ${ x _1, x _2 \in E . }$

Injectivity of ${ \Phi }$: Say ${ \Phi(\lambda _1) = \Phi (\lambda _2) }$ for some ${ \lambda _1, \lambda _2 \in L(E, L(E, F) ) .}$ Now

\[{ \begin{align*} \lVert \lambda _1 - \lambda _2 \rVert = &\, \lVert \Phi (\lambda _1 - \lambda _2) \rVert \\ = &\, \lVert \Phi(\lambda _1) - \Phi (\lambda _2) \rVert \\ = &\, 0, \end{align*} }\]

so ${ \lambda _1 = \lambda _2 . }$

Therefore ${ \Phi }$ is an isomorphism of normed spaces, as needed. ${ \blacksquare }$
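In finite dimensions this isomorphism is just currying. A quick numerical sketch (assuming numpy; the coefficient matrix ${ A }$ is an arbitrary choice), with a bilinear form ${ (x _1, x _2) \mapsto x _1 ^T A x _2 }$ on ${ \mathbb{R} ^3 }$ and its curried counterpart:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))  # coefficients of an arbitrary bilinear form on R^3

def lam_bilinear(x1, x2):
    # lambda in L^2(R^3; R): (x1, x2) |-> x1^T A x2
    return x1 @ A @ x2

def lam_curried(x1):
    # the corresponding element of L(R^3, L(R^3, R)): x1 |-> (x2 |-> x1^T A x2)
    return lambda x2: x1 @ A @ x2

x1, x2 = rng.standard_normal(3), rng.standard_normal(3)
# Phi(lam_curried) = lam_bilinear: both ways of feeding inputs agree
assert np.isclose(lam_curried(x1)(x2), lam_bilinear(x1, x2))
```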

Obs: Any bilinear map

\[{ \lambda : \mathbb{R} ^n \times \mathbb{R} ^n \to \mathbb{R} ^m , \quad \lambda = (\lambda _1, \ldots, \lambda _m) ^T }\]

is continuous because each component

\[{ \lambda _i (x _1, x _2) = \sum _{j _1, j _2 \in [n]} (x _1) _{j _1} (x _2) _{j _2} \lambda _i (e _{j _1}, e _{j _2}) }\]

is continuous.
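A numerical instance of this expansion (a sketch assuming numpy; the particular bilinear map is an arbitrary choice):

```python
import numpy as np

def lam(x, y):
    # an arbitrary bilinear map R^2 x R^2 -> R^2, written without matrices
    return np.array([x[0] * y[1], x[1] * y[0] - x[0] * y[0]])

e = np.eye(2)
x, y = np.array([2.0, -1.0]), np.array([0.5, 3.0])

# expand over the basis: lam(x, y) = sum_{j1, j2} x_{j1} y_{j2} lam(e_{j1}, e_{j2})
expansion = sum(x[j1] * y[j2] * lam(e[j1], e[j2])
                for j1 in range(2) for j2 in range(2))
assert np.allclose(expansion, lam(x, y))
```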

Obs [${ C ^2 }$ maps ${ \mathbb{R} ^n \to \mathbb{R} ^m }$]:
Consider ${ f : U (\subseteq \mathbb{R} ^n \text{ open}) \to \mathbb{R} ^m .}$
From the previous result on ${ C ^1 }$ maps ${ \mathbb{R} ^n \to \mathbb{R} ^m ,}$

\[{ \begin{align*} &(f : U \to \mathbb{R} ^m \text{ is } C ^2) \\ \iff &\left(f \text{ is } C ^1 \text{ and } Df: U \to L(\mathbb{R} ^n, \mathbb{R} ^m) \text{ is } C ^1 \right) \\ \iff &\left(f \text{ is } C ^1 \text{ and } Df: U \to (L(\mathbb{R} ^n, \mathbb{R} ^m), \lVert \cdot \rVert _F) \cong \mathbb{R} ^{mn} \text{ is } C ^1 \right) \\ \iff &\left(\begin{aligned} &\text{All partials } D _{j _1} f _i : U \to \mathbb{R} \text{ exist and are continuous;} \\ &\text{All partials } D _{j _1} D _{j _2} f _i : U \to \mathbb{R} \text{ exist and are continuous} \end{aligned} \right) . \end{align*} }\]

Thm [${ D ^2 f }$ for ${ C ^2 }$ maps ${\mathbb{R} ^n \to \mathbb{R} ^m }$]:
Let ${ f : U (\subseteq \mathbb{R} ^n \text{ open}) \to \mathbb{R} ^m }$ be a ${ C ^2 }$ map and ${ x \in U . }$ Then the second derivative

\[{ D^2 f (x) \in L(\mathbb{R} ^n, L(\mathbb{R} ^n, \mathbb{R} ^m)) \cong L ^2 (\mathbb{R} ^n; \mathbb{R} ^m) }\]

is given by its action on basis vectors

\[{ D ^2 f (x) : \mathbb{R} ^n \times \mathbb{R} ^n \to \mathbb{R} ^m \text{ bilinear,} }\] \[{ \boxed{\begin{align*} D ^2 f (x) \quad \underbrace{(e _{j _1})} _{\text{in } \mathbb{R} ^n} \quad \underbrace{(e _{j _2})} _{\text{in } \mathbb{R} ^n} = &\, \underbrace{\begin{pmatrix} D _{j _1} D _{j _2} f _1 (x) \\ \vdots \\ D _{j _1} D _{j _2} f _m (x) \end{pmatrix}} _{\text{in } \mathbb{R} ^m} \end{align*} } }\]

that is

\[{ D ^2 f (x) (x _1) (x _2) = \sum _{j _1, j _2 \in [n]} (x _1) _{j _1} (x _2) _{j _2} \begin{pmatrix} D _{j _1} D _{j _2} f _1 (x) \\ \vdots \\ D _{j _1} D _{j _2} f _m (x) \end{pmatrix} . }\]

For ${ m = 1 , }$ this second derivative looks like

\[{ D ^2 f (p) : \mathbb{R} ^n \times \mathbb{R} ^n \to \mathbb{R} \text{ bilinear,} }\] \[{ \begin{align*} D ^2 f (p) \, (x) (y) = &\, \sum _{i, j} x _i y _j D ^2 f (p) (e _i) (e _j) \\ = &\, \sum _{i, j} x _i y _j D _i D _j f(p) \\ = &\, x ^T \begin{pmatrix} D _1 D _1 f(p) &\cdots &D _1 D _n f(p) \\ \vdots &\ddots &\vdots \\ D _n D _1 f(p) &\cdots &D _n D _n f(p) \end{pmatrix} y \\ = &\, x ^T H f (p) y \end{align*} }\]

and ${ H f (p) = [D _i D _j f(p)] }$ is called the Hessian.

Pf: Let ${ f : U (\subseteq \mathbb{R} ^n \text{ open}) \to \mathbb{R} ^m }$ be ${ C ^2 }$ and ${ x \in U . }$ By the definition of derivative, for ${ h }$ in a neighbourhood of ${ 0, }$

\[{ \begin{align*} &Df(x+h) = Df(x) + D ^2 f(x) \, h + \lVert h \rVert \varphi (h) \text{ in } L(\mathbb{R} ^n, \mathbb{R} ^m) \\ &\, \text{with } \varphi(0) = 0 \text{ and } \varphi (h) \text{ continuous at } 0 . \end{align*} }\]

So (for ${ t _1 }$ in a neighbourhood of ${ 0 }$)

\[{ \begin{align*} Df(x + t _1 e _{j _1}) \, e _{j _2} = &\, Df (x) \, e _{j _2} + D ^2 f (x) (t _1 e _{j _1}) (e _{j _2}) \\ &\, + \lVert t _1 e _{j _1} \rVert \varphi (t _1 e _{j _1}) \, e _{j _2} \end{align*} }\]

that is

\[{ \begin{align*} \begin{pmatrix} D _{j _2} f _1 (x + t _1 e _{j _1}) \\ \vdots \\ D _{j _2} f _m (x + t _1 e _{j _1}) \end{pmatrix} = &\, \begin{pmatrix} D _{j _2} f _1 (x) \\ \vdots \\ D _{j _2} f _m (x) \end{pmatrix} + t _1 D ^2 f (x) (e _{j _1}) (e _{j _2}) \\ &\, + \vert t _1 \vert \varphi (t _1 e _{j _1}) \, e _{j _2} . \end{align*} }\]

Dividing by ${ t _1 \neq 0 }$ and letting ${ t _1 \to 0 }$ gives

\[{ D ^2 f (x) (e _{j _1}) (e _{j _2}) = \begin{pmatrix} D _{j _1} D _{j _2} f _1 (x) \\ \vdots \\ D _{j _1} D _{j _2} f _m (x) \end{pmatrix} }\]

as needed. ${ \blacksquare }$
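The boxed formula can be sanity-checked by finite differences. A sketch (assuming numpy; the test function ${ f(x) = x _1 ^2 x _2 + \sin x _2 }$ and the step size are arbitrary choices; indices are ${ 0 }$-based in the code):

```python
import numpy as np

def f(x):
    return x[0] ** 2 * x[1] + np.sin(x[1])

def hessian_fd(f, p, h=1e-5):
    # central-difference approximation of Hf(p)_{ij} = D_i D_j f(p)
    n = len(p)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.eye(n)[i] * h, np.eye(n)[j] * h
            H[i, j] = (f(p + ei + ej) - f(p + ei - ej)
                       - f(p - ei + ej) + f(p - ei - ej)) / (4 * h ** 2)
    return H

p = np.array([1.0, 2.0])
H = hessian_fd(f, p)
# hand-computed Hessian of f at p
H_exact = np.array([[2 * p[1], 2 * p[0]], [2 * p[0], -np.sin(p[1])]])
assert np.allclose(H, H_exact, atol=1e-4)
# the second derivative acts as the bilinear form (x, y) |-> x^T Hf(p) y
x, y = np.array([1.0, -1.0]), np.array([0.5, 2.0])
assert np.isclose(x @ H @ y, x @ H_exact @ y, atol=1e-3)
```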

Thm [Second derivatives are symmetric]:
Let ${ E, F }$ be complete normed spaces and ${ f : U (\subseteq E \text{ open}) \to F }$ a ${ C ^2 }$ map.
Then for every ${ x \in U, }$ the bilinear map ${ D ^2 f(x) }$ is symmetric, that is

\[{ D ^2 f (x) (v) (w) = D ^2 f (x) (w) (v). }\]

So especially if ${ f : U (\subseteq \mathbb{R} ^n \text{ open}) \to \mathbb{R} }$ is ${ C ^2 ,}$ its Hessian at every point is symmetric.

Pf: Let ${ x \in U . }$ There is an ${ r > 0 }$ such that ${ B(x, r) \subseteq U .}$

Pick any ${ v, w \in E . }$ As ${ D ^2 f (x) }$ scales as ${ D ^2 f (x) (\lambda v) (\mu w) = \lambda \mu \, D ^2 f (x) (v) (w) ,}$ we may assume ${ \lVert v \rVert, \lVert w \rVert < \frac{r}{2} ,}$ so that the points ${ x, x + v, x + w, x + v + w }$ are all in the ball. It then suffices to show

\[{ \text{To show: } D ^2 f (x) (v) (w) = D ^2 f (x) (w) (v) . }\]

Heuristic: We have two approximations

\[{ \begin{align*} f(x + v + w) \approx &\, f(x + v) + Df(x+v) \, w \\ \approx &\, f(x) + Df(x) \, v \\ &\, + [ Df(x) + D ^2 f(x) (v) ] w \end{align*} }\]

and

\[{ \begin{align*} f(x + v + w) \approx &\, f(x + w) + Df(x+w) \, v \\ \approx &\, f(x) + Df(x) \, w \\ &\, + [ Df(x) + D ^2 f(x) (w) ] v . \end{align*} }\]

Comparing both, one expects

\[{ D ^2 f (x) (v) (w) = D ^2 f (x) (w) (v) }\]

to hold.

Consider the difference maps

\[{ G _v, G _w : B (x, r/2) \longrightarrow F, }\] \[{ G _v (t) := f(t+v) - f(t), }\] \[{ G _w (t) := f(t+w) - f(t) . }\]

Applying ${ C ^1 }$ mean value theorem twice,

\[{ \begin{align*} &\, G _v (x+w) - G _v (x) \\ = &\, \left( \int _0 ^1 DG _v (x+tw) \, dt \right) \cdot w \\ = &\, \int _0 ^1 [ Df (x + tw + v) - Df (x + tw ) ] \, dt \cdot w \\ = &\, \int _0 ^1 \left[ \int _0 ^1 D ^2 f (x + tw + sv) \cdot v \, ds \right] \, dt \cdot w \\ = &\, \int _0 ^1 \left[ \int _0 ^1 D ^2 f (x + tw + sv) \, ds \right] \, dt \, (v) (w) \end{align*} }\]

and similarly

\[{ \begin{align*} &G _w (x + v) - G _w (x) \\ = &\, \int _0 ^1 \left[ \int _0 ^1 D ^2 f (x + tv + sw) \, ds \right] \, dt \, (w) (v). \end{align*} }\]

But note the differences

\[{ G _v (x + w) - G _v (x) = f(x+w+v) - f(x+w) - (f(x+v) - f(x)) }\] \[{ G _w (x + v) - G _w (x ) = f(x + v + w) - f(x+v) - (f(x+w) - f(x)) }\]

are equal, so

\[{ \begin{align*} &\int _0 ^1 \left[ \int _0 ^1 D ^2 f (x + tw + sv) \, ds \right] \, dt \, (v) (w) \\ = &\, \int _0 ^1 \left[ \int _0 ^1 D ^2 f (x + tv + sw) \, ds \right] \, dt \, (w) (v) \end{align*} }\]

that is

\[{ \begin{align*} &\, D ^2 f (x) (v) (w) + \underbrace{\int _0 ^1 \left[ \int _0 ^1 D ^2 f (x + tw + sv) - D ^2 f (x) \, ds \right] \, dt \, (v) (w)} _{=: \, \Phi (v, w)} \\ = &\, D ^2 f (x) (w) (v) + \underbrace{\int _0 ^1 \left[ \int _0 ^1 D ^2 f (x + tv + sw) - D ^2 f (x) \, ds \right] \, dt \, (w) (v)} _{=: \, \Psi (v, w)} . \end{align*} }\]

The error terms ${ \Phi (v, w) , \Psi (v, w) }$ are bounded by

\[{ \begin{align*} &\lVert \Phi (v, w) \rVert \\ \leq &\, \left\lVert \int _0 ^1 \left[ \int _0 ^1 D ^2 f (x + tw + sv) - D ^2 f (x) \, ds \right] \, dt \right\rVert \lVert v \rVert \lVert w \rVert \\ \leq &\, \sup _{0 \leq t \leq 1} \left\lVert \int _0 ^1 D ^2 f (x + tw + sv) - D ^2 f (x) \, ds\right\rVert \lVert v \rVert \lVert w \rVert \\ \leq &\, \sup _{0 \leq t \leq 1} \left( \sup _{0 \leq s \leq 1} \left\lVert D ^2 f (x + tw + sv) - D ^2 f (x) \right\rVert \right) \lVert v \rVert \lVert w \rVert \\ \leq &\, \sup _{0 \leq t \leq 1} \left( \sup _{0 \leq s, t \leq 1} \left\lVert D ^2 f (x + tw + sv) - D ^2 f (x) \right\rVert \right) \lVert v \rVert \lVert w \rVert \\ = &\, \sup _{0 \leq s, t \leq 1} \left\lVert D ^2 f (x + tw + sv) - D ^2 f (x) \right\rVert \lVert v \rVert \lVert w \rVert \end{align*} }\]

and

\[{ \lVert \Psi (v, w) \rVert \leq \sup _{0 \leq s, t \leq 1} \left\lVert D ^2 f (x + tv + sw) - D ^2 f (x) \right\rVert \lVert w \rVert \lVert v \rVert . }\]

So

\[{ \begin{align*} &\lVert D ^2 f(x) (v)(w) - D ^2 f (x) (w) (v) \rVert \\ = &\, \lVert \Phi (v, w) - \Psi (v, w) \rVert \\ \leq &\, 2 \sup _{0 \leq s, t \leq 1} \left\lVert D ^2 f (x + tw + sv) - D ^2 f (x) \right\rVert \lVert v \rVert \lVert w \rVert . \end{align*} }\]

Replacing ${ v, w }$ with ${ \tau v, \tau w }$ (where ${ \tau \in (0, 1) }$) in the above inequality, cancelling the common factor ${ \tau ^2 }$ from both sides, and using continuity of ${ D ^2 f , }$

\[{ \begin{align*} &\lVert D ^2 f(x) (v)(w) - D ^2 f (x) (w) (v) \rVert \\ \leq &\, 2 \sup _{0 \leq s, t \leq 1} \left\lVert D ^2 f (x + \tau(tw + sv)) - D ^2 f (x) \right\rVert \lVert v \rVert \lVert w \rVert \\ &\, \to 0 \text{ as } \tau \to 0 . \end{align*} }\]

Therefore

\[{ D ^2 f (x) (v) (w) = D ^2 f (x) (w) (v) }\]

as needed. ${ \blacksquare }$
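A concrete instance of the theorem (a sketch assuming numpy; the function ${ f(x, y) = e ^{xy} \cos y }$ and the evaluation point are arbitrary choices): differentiating first in ${ y }$ then in ${ x }$ by hand gives ${ D _x D _y f = (1 + xy) e ^{xy} \cos y - y e ^{xy} \sin y , }$ and differentiating in the other order gives the same expression, as the theorem predicts; a central-difference estimate agrees.

```python
import numpy as np

def f(x, y):
    return np.exp(x * y) * np.cos(y)

def mixed_fd(f, x, y, h=1e-4):
    # central-difference estimate of the mixed second partial at (x, y)
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)

x, y = 0.7, -1.3
# the hand-derived mixed partial, same in either order of differentiation
exact = (1 + x * y) * np.exp(x * y) * np.cos(y) - y * np.exp(x * y) * np.sin(y)
assert np.isclose(mixed_fd(f, x, y), exact, atol=1e-5)
```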

Back to top.
\[{ \underline{\textbf{Third derivative}} }\]

Def [${ C ^3 }$ maps]: Let ${ E, F }$ be complete normed spaces, and ${ f : U (\subseteq E \text{ open}) \to F .}$
We say ${ f }$ is ${ C ^3 }$ if it is ${ C ^2 }$ and the second derivative is ${ C ^1 , }$ that is if the derivatives

\[{ Df : U \longrightarrow L(E, F) , }\] \[{D(Df) = D ^2 f : U \longrightarrow L (E, L(E, F)), }\] \[{ D (D ^2 f) = D ^3 f : U \longrightarrow L(E, L (E, L(E, F))) }\]

exist and are continuous.

We saw ${ L(E, L(E, F)) \cong L ^2 (E; F) }$ as normed spaces. This can be generalised.

Obs [${ L (E, L ^k (E; F)) \cong L ^{k+1} (E; F) }$ as normed spaces]:
Let ${ E, F }$ be complete normed spaces. Then

\[{ \Phi : L(E, L ^k (E; F)) \longrightarrow L ^{k+1} (E; F), }\] \[{ \lambda \mapsto ((x _1, \ldots, x _{k+1}) \mapsto \lambda (x _1) (x _2, \ldots, x _{k+1})) }\]

is an isomorphism of normed spaces.

Here ${ \lambda }$ in ${ L(E, L ^k (E; F)) }$ takes inputs as ${ \lambda (\underline{\,}) (\underline{\,}, \ldots, \underline{\,}) , }$ and ${ \Phi (\lambda) }$ in ${ L ^{k+1} (E; F) }$ takes inputs as ${ \Phi (\lambda) (\underline{\,} , \underline{\,} , \ldots , \underline{\,} ) . }$
Writing (informally) ${ \Phi (\lambda) }$ as ${ \lambda (\underline{\,} , \underline{\,}, \ldots , \underline{\,}) , }$ the normed space isomorphism is

\[{ \begin{align*} &L (E, L ^k (E; F)) \overset{\cong}{\longleftrightarrow} L ^{k+1} (E; F) , \\ &\lambda (\underline{\,}) (\underline{\,}, \ldots, \underline{\,}) \longleftrightarrow \lambda (\underline{\,} , \underline{\,}, \ldots , \underline{\,}) . \end{align*} }\]

Pf: Similar to the previous proof/verification that ${ L (E, L(E, F)) \cong L ^2 (E; F) }$ as normed spaces.

Obs [${ L(E, \ldots, L(E, L(E, F)) \ldots ) \cong L ^k (E; F) }$ as normed spaces]:
Let ${ E, F }$ be complete normed spaces. Repeatedly using the above result gives

\[{ \begin{align*} \underbrace{L(E, \ldots , L(E, L(E, L(E, F))) \ldots )} _{k \text{ many } Es} \cong &\, \underbrace{L(E, \ldots, L(E, L ^2 (E; F)) \ldots )} _{k-1 \text{ many } Es} \\ \cong &\, \, \cdots \\ \cong &\, L ^k (E; F) \end{align*} }\]

as normed spaces. The composed isomorphism is

\[{\Phi : \underbrace{L(E, \ldots, L(E, F) \ldots)} _{k \text{ many } Es} \longrightarrow L ^k (E; F) , }\] \[{ \lambda \mapsto ((x _1, \ldots, x _k) \mapsto \lambda (x _1) \ldots (x _k)) . }\]

For example when ${ k = 3, }$

\[{ \begin{align*} &L(E, L(E, L(E, F))) \overset{\cong}{\longleftrightarrow} L(E, L ^2 (E; F)) \overset{\cong}{\longleftrightarrow} L ^3 (E; F), \\ &\, \quad \lambda (\_) (\_) (\_) \quad \longleftrightarrow \quad \lambda (\_) ( \_ \, , \, \_ ) \quad \longleftrightarrow \quad \lambda (\_ \, , \, \_ \, , \_ ) . \end{align*} }\]

Obs: Any multilinear map

\[{ \lambda : \underbrace{\mathbb{R} ^n \times \ldots \times \mathbb{R} ^n} _{k \text{ many}} \to \mathbb{R} ^m, \quad \lambda = (\lambda _1, \ldots, \lambda _m) ^T }\]

is continuous because each component

\[{ \lambda _i (x _1, \ldots, x _k ) = \sum _{j _1 , \ldots, j _k \in [n]} (x _1) _{j _1} \ldots (x _k) _{j _k} \lambda _i (e _{j _1}, \ldots, e _{j _k}) }\]

is continuous.

Obs: Any ${ \lambda \in L ^k (\mathbb{R} ^n; \mathbb{R} ^m) }$ expands as above. Note

\[{ \Phi : L ^k (\mathbb{R} ^n ; \mathbb{R} ^m) \longrightarrow \lbrace \text{maps } [n] ^k \times [m] \to \mathbb{R} \rbrace, }\] \[{ \lambda \longmapsto \left( (j _1, \ldots, j _k; i) \mapsto \lambda _i (e _{j _1}, \ldots, e _{j _k}) \right) }\]

is linear and bijective, making it an isomorphism of vector spaces.
Especially ${ L ^k (\mathbb{R} ^n ; \mathbb{R} ^m) }$ is ${ n ^k m }$-dimensional, and all norms on it are equivalent.
The right hand side has the Frobenius norm

\[{ \lVert f \rVert _F = \sqrt{\sum f(j _1, \ldots, j _k; i) ^2 }, }\]

therefore

\[{ \lVert \lambda \rVert _{\text{Mult } F} := \lVert \Phi (\lambda) \rVert _F = \sqrt{\sum \lambda _i (e _{j _1}, \ldots, e _{j _k}) ^2 } }\]

is a valid norm on ${ L ^k (\mathbb{R} ^n ; \mathbb{R} ^m ) }$ and makes ${ \Phi }$ an isomorphism of normed spaces.
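A small sketch of this identification (assuming numpy; the coefficient array is an arbitrary choice), for ${ k = 2, n = 2, m = 3 }$ and with ${ 0 }$-based indices:

```python
import numpy as np

# identify lambda in L^2(R^2; R^3) with its coefficient array
# T[i, j1, j2] = lambda_i(e_{j1}, e_{j2})
rng = np.random.default_rng(1)
T = rng.standard_normal((3, 2, 2))  # an arbitrary choice of coefficients

def lam(x1, x2):
    # evaluate lambda via the basis expansion
    return np.einsum('ijk,j,k->i', T, x1, x2)

# evaluating on basis vectors recovers the coefficients ...
e = np.eye(2)
coeffs = np.array([[lam(e[j1], e[j2]) for j2 in range(2)] for j1 in range(2)])
assert np.allclose(coeffs.transpose(2, 0, 1), T)

# ... and the Mult-F norm is the Frobenius norm of the coefficient array
mult_f_norm = np.sqrt(np.sum(T ** 2))
assert np.isclose(mult_f_norm, np.linalg.norm(T))
```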

Obs [${ C ^3 }$ maps ${ \mathbb{R} ^n \to \mathbb{R} ^m }$]:
Let ${ f : U (\subseteq \mathbb{R} ^n \text{ open}) \to \mathbb{R} ^m .}$ Say ${ f }$ is already ${ C ^2 , }$ that is all partials ${ D _{j _1 } f _i : U \to \mathbb{R} }$ and ${ D _{j _1} D _{j _2} f _i : U \to \mathbb{R} }$ exist and are continuous.
Now ${ f }$ is ${ C ^3 }$ if and only if ${ D ^2 f : U \to L ^2 (\mathbb{R} ^n ; \mathbb{R} ^m) }$ is ${ C ^1 , }$ if and only if ${ D ^2 f : U \to (L ^2 (\mathbb{R} ^n ; \mathbb{R} ^m) , \lVert \cdot \rVert _{\text{Mult } F} ) }$ is ${ C ^1 , }$ if and only if

\[{ U \longrightarrow (\lbrace \text{maps } [n] ^2 \times [m] \to \mathbb{R} \rbrace, \lVert \cdot \rVert _F) }\] \[{ x \longmapsto \left(\begin{align*} (j _1, j _2; i) \mapsto &\, D ^2 f _i (x) (e _{j _1}) (e _{j _2}) \\ &\, = D _{j _1} D _{j _2} f _i (x) \end{align*} \right) }\]

is ${ C ^ 1 , }$ if and only if all partials ${ D _{j _1} D _{j _2} D _{j _3} f _i : U \to \mathbb{R} }$ exist and are continuous.

Thm [${ D ^3 f }$ for ${ C ^3 }$ maps ${ \mathbb{R} ^n \to \mathbb{R} ^m }$]:
Let ${ f : U (\subseteq \mathbb{R} ^n \text{ open}) \to \mathbb{R} ^m }$ be a ${ C ^3 }$ map and ${ x \in U . }$
Then the third derivative

\[{ D ^3 f (x) \in L (\mathbb{R} ^n , L(\mathbb{R} ^n, L(\mathbb{R} ^n, \mathbb{R} ^m))) \cong L ^3 (\mathbb{R} ^n; \mathbb{R} ^m) }\]

is given by its action on basis vectors

\[{ D ^3 f (x) : \mathbb{R} ^n \times \mathbb{R} ^n \times \mathbb{R} ^n \to \mathbb{R} ^m \quad \text{multilinear}, }\] \[{ \boxed{D ^3 f (x) (e _{j _1}) (e _{j _2}) (e _{j _3}) = \begin{pmatrix} D _{j _1} D _{j _2} D _{j _3} f _1 (x) \\ \vdots \\ D _{j _1} D _{j _2} D _{j _3} f _m (x) \end{pmatrix} } }\]

that is

\[{ \begin{align*} &D ^3 f (x) (x _1) (x _2) (x _3) \\ = &\, \sum _{j _1, j _2, j _3 \in [n]} (x _1) _{j _1} (x _2) _{j _2} (x _3) _{j _3} \begin{pmatrix} D _{j _1} D _{j _2} D _{j _3} f _1 (x) \\ \vdots \\ D _{j _1} D _{j _2} D _{j _3} f _m (x) \end{pmatrix}. \end{align*} }\]

Pf: Let ${ f : U (\subseteq \mathbb{R} ^n \text{ open}) \to \mathbb{R} ^m }$ be ${ C ^3 }$ and ${ x \in U . }$
By the definition of derivative, for ${ h }$ in a neighbourhood of ${ 0 , }$

\[{ \begin{align*} &D ^2 f (x + h) = D ^2 f(x) + D ^3 f (x) \, h + \lVert h \rVert \varphi (h) \\ &\, \text{with } \varphi (0) = 0 \text{ and } \varphi \text{ continuous at } 0 . \end{align*} }\]

So (for ${ t _1 }$ in a neighbourhood of ${ 0 }$)

\[{ \begin{align*} &D ^2 f(x + t _1 e _{j _1}) \, (e _{j _2}) (e _{j _3}) \\ = &\, D ^2 f (x) \, (e _{j _2}) (e _{j _3}) + D ^3 f (x) (t _1 e _{j _1}) \, (e _{j _2}) (e _{j _3}) \\ &+ \lVert t _1 e _{j _1} \rVert \varphi (t _1 e _{j _1}) \, (e _{j _2}) (e _{j _3}) \end{align*} }\]

that is

\[{ \begin{align*} &\begin{pmatrix} D _{j _2} D _{j _3} f _1 (x + t _1 e _{j _1}) \\ \vdots \\ D _{j _2} D _{j _3} f _m (x + t _1 e _{j _1}) \end{pmatrix} \\ = &\, \begin{pmatrix} D _{j _2} D _{j _3} f _1 (x) \\ \vdots \\ D _{j _2} D _{j _3} f _m (x) \end{pmatrix} + t _1 D ^3 f (x) (e _{j _1}) (e _{j _2}) (e _{j _3}) \\ &\, + \vert t _1 \vert \varphi (t _1 e _{j _1}) \, (e _{j _2}) (e _{j _3}) . \end{align*} }\]

Dividing by ${ t _1 \neq 0 }$ and letting ${ t _1 \to 0 }$ gives

\[{ D ^3 f (x) (e _{j _1}) (e _{j _2}) (e _{j _3}) = \begin{pmatrix} D _{j _1} D _{j _2} D _{j _3} f _1 (x) \\ \vdots \\ D _{j _1} D _{j _2} D _{j _3} f _m (x) \end{pmatrix} }\]

as needed. ${ \blacksquare }$

Back to top.
\[{ \underline{\textbf{Higher derivatives}} }\]

Def [${ C ^k }$ maps]: Let ${ E, F }$ be complete normed spaces, and ${ f : U (\subseteq E \text{ open}) \to F .}$
We say ${ f }$ is ${ C ^k }$ if the derivatives

\[{ Df : U \longrightarrow L(E, F), }\] \[{ D ^2 f : U \longrightarrow L (E, L(E, F)) \cong L ^2 (E; F), }\] \[{ \vdots }\] \[{ D ^k f : U \longrightarrow \underbrace{L(E, \ldots L(E, F) \ldots )} _{k \text{ many } Es} \cong L ^k (E; F) }\]

exist and are continuous.

Obs [Composition of ${ C ^k }$ maps is ${ C ^k }$]:
Let ${ E, F, G }$ be complete normed spaces. If

\[{ U (\subseteq E \text{ open}) \overset{f}{\longrightarrow} V (\subseteq F \text{ open}) \overset{g}{\longrightarrow} G }\]

are ${ C ^k }$ maps, then so is the composition ${ g \circ f . }$
Pf: Induction on ${ k . }$
For ${ k= 1 }$ case: Say ${ f, g }$ are ${ C ^1 . }$ By chain rule

\[{ D (g \circ f)(x) = (Dg) (f(x)) \circ (Df) (x) . }\]

We are to show the map ${ x \mapsto D(g \circ f) (x) }$ is continuous. The compositions

\[{ \underbrace{ x} _{\in \, U} \mapsto \underbrace{f(x)} _{\in \, V} \mapsto \underbrace{(Dg)(f(x))} _{\in \, L(F, G)} \mapsto \underbrace{((Dg)(f(x)), 0)} _{\in L(F, G) \times L(E, F) } }\]

and

\[{ \underbrace{x} _{\in \, U} \mapsto \underbrace{(Df)(x)} _{\in \, L(E, F)} \mapsto \underbrace{(0, Df(x))} _{\in \, L(F, G) \times L(E, F)} }\]

are continuous, hence their sum

\[{ \underbrace{x} _{\in \, U} \mapsto \underbrace{((Dg )(f(x)), Df(x))} _{\in \, L(F, G) \times L(E, F)} }\]

is continuous. The map

\[{ L(F, G) \times L(E, F) \longrightarrow L(E, G) , \quad (\alpha, \beta) \mapsto \alpha \circ \beta }\]

is continuous bilinear, hence the composition

\[{ \underbrace{x} _{\in U} \mapsto \underbrace{((Dg )(f(x)), Df(x))} _{\in \, L(F, G) \times L(E, F)} \mapsto \underbrace{(Dg) (f(x)) \circ (Df)(x)} _{\in \, L(E, G)} }\]

is continuous, as needed.

For induction step: Say the theorem statement ${ P(k) }$ is true for some ${ k. }$ We are to show ${ P(k+1) }$ is true.
Let

\[{ U (\subseteq E \text{ open}) \overset{f}{\longrightarrow} V (\subseteq F \text{ open}) \overset{g}{\longrightarrow} G }\]

be ${ C ^{k+1} }$ maps, with the induction hypothesis that composition of (any two compatible) ${ C ^k }$ maps is ${ C ^k . }$ We are to show ${ g \circ f }$ is ${ C ^{k+1}, }$ that is ${ x \mapsto D(g \circ f) (x) }$ is ${ C ^k . }$
The compositions

\[{ \underbrace{ x} _{\in \, U} \mapsto \underbrace{f(x)} _{\in \, V} \mapsto \underbrace{(Dg)(f(x))} _{\in \, L(F, G)} \mapsto \underbrace{((Dg)(f(x)), 0)} _{\in L(F, G) \times L(E, F) } }\]

and

\[{ \underbrace{x} _{\in \, U} \mapsto \underbrace{(Df)(x)} _{\in \, L(E, F)} \mapsto \underbrace{(0, Df(x))} _{\in \, L(F, G) \times L(E, F)} }\]

are ${ C ^k , }$ hence their sum

\[{ \underbrace{x} _{\in \, U} \mapsto \underbrace{((Dg )(f(x)), Df(x))} _{\in \, L(F, G) \times L(E, F)} }\]

is ${ C ^k . }$ The map

\[{ L(F, G) \times L(E, F) \longrightarrow L(E, G) , \quad (\alpha, \beta) \mapsto \alpha \circ \beta }\]

is continuous bilinear, hence the composition

\[{ \underbrace{x} _{\in U} \mapsto \underbrace{((Dg )(f(x)), Df(x))} _{\in \, L(F, G) \times L(E, F)} \mapsto \underbrace{(Dg) (f(x)) \circ (Df)(x)} _{\in \, L(E, G)} }\]

is ${ C ^k , }$ as needed. ${ \blacksquare }$
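The ${ k = 1 }$ case (the chain rule itself) can be sanity-checked with finite-difference Jacobians. A sketch (assuming numpy; the maps ${ f, g }$ and the step size are arbitrary choices):

```python
import numpy as np

def f(x):
    # f: R^2 -> R^2
    return np.array([x[0] * x[1], np.sin(x[0])])

def g(y):
    # g: R^2 -> R^1
    return np.array([np.exp(y[0]) + y[1] ** 2])

def jacobian_fd(F, x, h=1e-6):
    # central-difference Jacobian of F at x, one column per basis direction
    cols = [(F(x + h * e) - F(x - h * e)) / (2 * h) for e in np.eye(len(x))]
    return np.stack(cols, axis=1)

x = np.array([0.4, 1.1])
# chain rule: D(g o f)(x) = Dg(f(x)) o Df(x), a matrix product of Jacobians
lhs = jacobian_fd(lambda t: g(f(t)), x)
rhs = jacobian_fd(g, f(x)) @ jacobian_fd(f, x)
assert np.allclose(lhs, rhs, atol=1e-5)
```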

Obs [${ C ^k }$ maps ${ \mathbb{R} ^n \to \mathbb{R} ^m }$]:
Let ${ f : U (\subseteq \mathbb{R} ^n \text{ open}) \to \mathbb{R} ^m .}$ Then ${ f }$ is a ${ C ^k }$ map if and only if all partials ${ D _{j _1} f _i : U \to \mathbb{R}, }$ ${ D _{j _1} D _{j _2} f _i : U \to \mathbb{R}, }$ ${ \ldots, }$ ${ D _{j _1} \ldots D _{j _k} f _i : U \to \mathbb{R} }$ exist and are continuous.
Pf: Proof by induction. Similar to the proof for ${ C ^2 }$ and ${ C ^3 }$ maps.

Obs [${ D ^k f }$ for ${ C ^k }$ maps ${ \mathbb{R} ^n \to \mathbb{R} ^m }$]:
Let ${ f : U (\subseteq \mathbb{R} ^n \text{ open}) \to \mathbb{R} ^m }$ be a ${ C ^k }$ map and ${ x \in U . }$
Then the ${ k ^{\text{th}} }$ derivative

\[{ D ^k f (x) \in \underbrace{L(\mathbb{R} ^n, \ldots L(\mathbb{R} ^n, \mathbb{R} ^m) \ldots )} _{k \text{ nestings}} \cong L ^k (\mathbb{R} ^n; \mathbb{R} ^m) }\]

is given by its action on basis vectors

\[{ D ^k f (x) : \underbrace{\mathbb{R} ^n \times \ldots \times \mathbb{R} ^n} _{k \text{ many}} \longrightarrow \mathbb{R} ^m \quad \text{ multilinear}, }\] \[{ \boxed{D ^k f(x) (e _{j _1}) \ldots (e _{j _k}) = \begin{pmatrix} D _{j _1} \ldots D _{j _k} f _1 (x) \\ \vdots \\ D _{j _1} \ldots D _{j _k} f _m (x) \end{pmatrix} } }\]

that is

\[{ \begin{align*} D ^k f (x) (x _1) \ldots (x _k) = &\, \sum _{j _1, \ldots, j _k \in [n]} (x _1) _{j _1} \ldots (x _k) _{j _k} \begin{pmatrix} D _{j _1} \ldots D _{j _k} f _1 (x) \\ \vdots \\ D _{j _1} \ldots D _{j _k} f _m (x) \end{pmatrix} . \end{align*} }\]

Pf: Similar to the proof for ${ C ^2 }$ and ${ C ^3 }$ maps.

Back to top.
\[{ \underline{\textbf{Taylor’s theorem}} }\]

Recall the proof of Taylor’s theorem for real functions (using integration by parts). It generalises as follows.

Thm [Taylor’s theorem]:

Consider complete normed spaces ${ E, F, }$ and a ${ C ^p }$ map ${ f : U (\subseteq E \text{ open}) \to F .}$ Fix an ${ x \in U . }$

Let ${ y \in E }$ be such that the segment ${ [[x, x + y]] = \lbrace x + ty : t \in [0, 1] \rbrace }$ is contained in ${ U . }$ Then, denoting by ${ y ^{(k)} }$ the ${ k -}$tuple ${ (y, \ldots, y), }$ we have

\[{ \begin{align*} &f(x+y) \\ = &\, f(x) + \frac{Df(x) y ^{(1)}}{1!} + \ldots + \frac{D ^p f(x) y ^{(p)} }{p!} + R _p (y) \end{align*} }\]

where

\[{ R _p (y) = \int _0 ^1 \frac{(1-t) ^{p-1}}{(p-1)!} [ D ^p f(x + ty) - D ^p f (x)] y ^{(p)} \, dt . }\]

Lem: If ${ F }$ is a complete normed space and ${ f : [a, b] \to F }$ is regulated, then ${ \lVert f \rVert }$ is regulated and ${ \lVert \int _a ^b f \rVert \leq \int _a ^b \lVert f \rVert .}$
Pf: There are step maps ${ s _n : [a, b] \to F }$ converging uniformly to ${ f .}$ Now ${ \sup _{t \in [a, b]} \vert \lVert s _n (t) \rVert - \lVert f (t) \rVert \vert \leq \lVert s _n - f \rVert \to 0 }$ so step maps ${ \lVert s _n \rVert }$ converge uniformly to ${ \lVert f \rVert . }$ Also each term ${ \lVert \int _a ^b s _n \rVert \leq \int _a ^b \lVert s _n \rVert, }$ so letting ${ n \to \infty }$ gives ${ \lVert \int _a ^b f \rVert \leq \int _a ^b \lVert f \rVert }$ as needed.

Therefore the remainder ${ R _p (y) }$ can be bounded as

\[{ \begin{align*} \lVert R _p (y) \rVert \leq &\, \int _0 ^1 \frac{(1-t) ^{p-1} }{(p-1)!} \lVert [D ^p f(x + ty) - D ^p f(x)] y ^{(p)} \rVert \, dt \\ \leq &\, \sup _{0 \leq t \leq 1} \lVert D ^p f (x + ty) - D ^p f(x) \rVert \lVert y \rVert ^p \int _0 ^1 \frac{(1-t) ^{p-1}}{(p-1)!} \, dt \\ = &\, \sup _{0 \leq t \leq 1} \lVert D ^p f (x + ty) - D ^p f (x) \rVert \frac{\lVert y \rVert ^p}{p!} . \end{align*} }\]

Especially,

\[{ \lim _{y \to 0 } \frac{\lVert R _p (y) \rVert}{\lVert y \rVert ^p} = 0 }\]

that is ${ R _p (y) }$ is ${ o(\lVert y \rVert ^p) . }$

Pf: Consider the continuous bilinear map

\[{ \bullet : F \times \mathbb{R} \to F, \quad v \bullet a = a v . }\]

Using this, we can integrate by parts any product ${ \varphi _1 (t) \bullet D\varphi _2 (t) }$ where ${ \varphi _1 : [0, 1] \to F }$ and ${ \varphi _2 : [0, 1] \to \mathbb{R} }$ are ${ C ^1 }$ maps. So, mimicking the proof for real functions,

\[{ \begin{align*} &f(x + y) \\ &\quad \\ = &\, f(x) + \int _0 ^1 \underbrace{Df(x + ty)y} _{\begin{aligned} &\varphi _1 (t) = (Df \circ \sigma )(t)y ; \\ &\sigma(t) = x + ty \end{aligned} } \bullet \underbrace{1} _{ \begin{aligned} &D\varphi _2 (t); \\ &\varphi _2 (t) = t-1 \end{aligned} } \, dt \\ &\quad \\ = &\, f(x) + Df(x + ty)y \bullet (t - 1) \Bigg\vert _0 ^1 \\ &- \int _0 ^1 D ^2 f (x + ty) y y \bullet (t - 1) \, dt \\ &\quad \\ = &\, f(x) + Df(x) y \\ &- \left( D ^2 f(x + ty) y ^{(2)} \bullet \frac{(t-1) ^2}{2} \Bigg\vert _0 ^1 - \int _0 ^1 D ^3 f(x + ty) y ^{(3)} \bullet \frac{(t-1) ^2}{2} \, dt \right) \\ &\quad \\ = &\, f(x) + Df(x) y + \frac{D ^2 f (x) y ^{(2)}}{2} \\ &+ \int _0 ^1 D ^3 f (x + ty) y ^{(3)} \bullet \frac{(t-1) ^2}{2} \, dt \\ &\quad \\ = &\, f(x) + Df(x) y + \frac{D ^2 f (x) y ^{(2)}}{2} \\ &+ \left( D ^3 f(x + ty) y ^{(3)} \bullet \frac{(t-1) ^3}{2 \cdot 3 } \Bigg\vert _0 ^1 - \int _0 ^1 D ^4 f (x + ty) y ^{(4)} \bullet \frac{(t-1) ^3}{2 \cdot 3 } \, dt \right) \\ &\quad \\ = &\, f(x) + Df(x)y + \frac{D ^2 f (x) y ^{(2)}}{2} + \frac{D ^3 f (x) y ^{(3)} }{2 \cdot 3} \\ &- \int _0 ^1 D ^4 f (x + ty) y ^{(4)} \bullet \frac{(t-1) ^3}{2 \cdot 3 } \, dt \\ &\quad \\ &\quad \vdots \\ &\quad \\ = &\, f(x) + Df(x) y + \frac{D ^2 f (x) y ^{(2)}}{2} + \ldots + \frac{D ^{p-1} f (x) y ^{(p-1)}}{(p-1)!} \\ &+ (-1) ^{p-1} \int _0 ^1 D ^p f(x + ty) y ^{(p)} \bullet \frac{(t-1) ^{p-1}}{(p-1)!} \, dt \\ &\quad \\ = &\, f(x) + Df(x) y + \frac{D ^2 f (x) y ^{(2)}}{2} + \ldots + \frac{D ^{p-1} f (x) y ^{(p-1)}}{(p-1)!} \\ &+ \int _0 ^1 \frac{(1-t) ^{p-1}}{(p-1)!} D ^p f(x + ty ) y ^{(p)} \, dt \\ &\quad \\ = &\, f(x) + Df(x) y + \ldots + \frac{D ^{p-1} f (x) y ^{(p-1)}}{(p-1)!} + \frac{D ^p f(x) y ^{(p)}}{p!} \\ &+ \int _0 ^1 \frac{(1-t) ^{p-1}}{(p-1)!} [ D ^p f(x + ty) - D ^p f(x)] y ^{(p)} \, dt \end{align*} }\]

as needed. ${ \blacksquare }$
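The ${ o(\lVert y \rVert ^p) }$ behaviour of the remainder can also be seen numerically. A sketch (assuming numpy; the choices ${ f(x) = e ^{x _1} \cos x _2 , }$ base point ${ 0 , }$ ${ p = 2 }$ and the direction are arbitrary): the ratio ${ \vert R _2 (y) \vert / \lVert y \rVert ^2 }$ shrinks roughly linearly in ${ \lVert y \rVert . }$

```python
import numpy as np

def f(x):
    return np.exp(x[0]) * np.cos(x[1])

p = np.zeros(2)
Df = np.array([1.0, 0.0])                 # Df(0), computed by hand
H = np.array([[1.0, 0.0], [0.0, -1.0]])   # Hf(0), computed by hand

def R2(y):
    # remainder after the order-2 Taylor polynomial at 0
    return f(y) - (f(p) + Df @ y + 0.5 * y @ H @ y)

d = np.array([1.0, 2.0]) / np.sqrt(5.0)   # a fixed unit direction
ratios = [abs(R2(t * d)) / t ** 2 for t in (1e-1, 1e-2, 1e-3)]
# R_2(y) = o(||y||^2): the ratios shrink as ||y|| -> 0
assert ratios[0] > ratios[1] > ratios[2]
```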

Def [Local extrema]:
Let ${ E }$ be a complete normed space, ${ f : U (\subseteq E \text{ open}) \to \mathbb{R} , }$ and ${ p \in U . }$
We say ${ f }$ has a local minimum at ${ p }$ if there is an ${ r > 0 }$ such that

\[{ f(p) \leq f(x) \, \, \text{ for all } x \in B(p, r). }\]

We say ${ f }$ has a strict local minimum at ${ p }$ if there is an ${ r > 0 }$ such that

\[{ f(p) < f(x) \, \, \text{ for all } x \in B(p, r), x \neq p. }\]

Local maximum and strict local maximum are defined similarly.

We see ${ f }$ has a local maximum at ${ p }$ if and only if ${ - f }$ has a local minimum at ${ p ,}$ and ${ f }$ has a strict local maximum at ${ p }$ if and only if ${ - f }$ has a strict local minimum at ${ p . }$

Obs [Necessary condition for local min]:
Let ${ E }$ be a complete normed space, ${ f : U (\subseteq E \text{ open}) \to \mathbb{R} , }$ and ${ p \in U .}$ Let ${ f }$ be differentiable at ${ p .}$ Now

\[{ f \text{ has a local minimum at } p \implies Df(p) = 0 . }\]

Pf: Say ${ f }$ has a local minimum at ${ p . }$ Let ${ h \in E, h \neq 0 . }$ It suffices to show ${ Df(p) \, h = 0 . }$
For some ${ \varepsilon > 0 }$ we have

\[{ g : (- \varepsilon, \varepsilon) \longrightarrow \mathbb{R}, \quad g(t) := f(p + th). }\]

Note ${ g }$ is differentiable at ${ 0 }$ and has a local minimum at ${ 0 , }$ so ${ g ' (0) = 0 , }$ that is ${ Df(p) \, h = 0 , }$ as needed. ${ \blacksquare }$

Thm [Necessary condition for local min]:
Let ${ f : U(\subseteq \mathbb{R} ^n \text{ open}) \to \mathbb{R} }$ be a ${ C ^2 }$ map, and ${ p \in U .}$ Now

\[{ \boxed{\begin{align*} &f \text{ has a local minimum at } p \\ \implies &Df(p) = 0, \, Hf(p) \geq 0 .\end{align*}} }\]

Pf: By Taylor’s theorem, (for ${ h }$ in a neighbourhood of ${ 0 }$)

\[{ \begin{align*} &f(p+h) = f(p) + Df(p) \, h + \frac{1}{2} h ^T Hf(p) \, h + \lVert h \rVert ^2 \varphi (h) \\ &\text{with } \varphi(0) = 0 \text{ and } \varphi \text{ continuous at } 0 . \end{align*} }\]

Say ${ f }$ has a local minimum at ${ p . }$ By previous observation, ${ Df(p) = 0 . }$ Let ${ e }$ be a unit vector in ${ \mathbb{R} ^n . }$ It suffices to show

\[{ \text{To show: } \quad e ^T Hf(p) \, e \geq 0 . }\]

Putting ${ h = te }$ above and using local minimality, (for ${ t }$ in a neighbourhood of ${ 0 }$)

\[{ \begin{align*} f(p+te) - f(p) = &\, \frac{t ^2}{2} e ^T Hf(p) \, e + t ^2 \varphi(te) \geq 0 . \end{align*} }\]

Let ${ \varepsilon > 0 . }$ By continuity of ${ \varphi }$ at ${ 0 , }$ we have for ${ t }$ in a neighbourhood of ${ 0 }$

\[{ \begin{align*} &\frac{t ^2}{2} e ^T Hf(p) \, e + t ^2 \varphi(te) \geq 0 \\ &\text{and } \vert \varphi (t e) \vert < \varepsilon. \end{align*} }\]

For ${ t \neq 0 }$ in this neighbourhood

\[{ \begin{align*} e ^T Hf(p) \, e &\geq -2 \varphi(t e) \\ &> - 2 \varepsilon. \end{align*} }\]

So ${ e ^T Hf(p) \, e > - 2 \varepsilon }$ for every ${ \varepsilon > 0 , }$ that is ${ e ^T Hf(p) \, e \geq 0 }$ as needed. ${ \blacksquare }$

Thm [Sufficient condition for local min]:
Let ${ f : U(\subseteq \mathbb{R} ^n \text{ open}) \to \mathbb{R} }$ be a ${ C ^2 }$ map, and ${ p \in U .}$ Now

\[{ \boxed{\begin{align*} &Df(p) = 0, Hf(p) > 0 \\ \implies &f \text{ has a strict local minimum at } p . \end{align*}} }\]

Pf: Say ${ Df(p) = 0 }$ and ${ Hf(p) > 0 . }$ We are to show ${ f }$ has a strict local minimum at ${ p . }$
By Taylor’s theorem, (for ${ h }$ in a neighbourhood of ${ 0 }$)

\[{ \begin{align*} &f(p+h) = f(p) + \frac{1}{2} h ^T Hf(p) \, h + \lVert h \rVert ^2 \varphi (h) \\ &\text{with } \varphi(0) = 0 \text{ and } \varphi \text{ continuous at } 0 . \end{align*} }\]

The quadratic form ${ e \mapsto e ^T Hf(p) \, e }$ is continuous and, by hypothesis, positive on the compact unit sphere, so it attains a positive minimum there: there is a unit vector ${ e _0 }$ such that

\[{ \min _{\lVert e \rVert = 1} e ^T Hf(p) \, e = e _0 ^T Hf(p) \, e _0 > 0 . }\]

Hence (for ${ h }$ in a neighbourhood of ${ 0 }$)

\[{ \begin{align*} &f(p+h) - f(p) \\ = &\, \frac{1}{2} h ^T Hf(p) \, h + \lVert h \rVert ^2 \varphi (h) \\ \geq &\, \frac{1}{2} \lVert h \rVert ^2 (\underbrace{e _0 ^T Hf(p) \, e _0} _{ = \, \alpha > 0 } ) + \lVert h \rVert ^2 \varphi (h ) \\ = &\, \lVert h \rVert ^2 \left( \frac{\alpha}{2} + \varphi(h) \right) . \end{align*} }\]

By continuity of ${ \varphi }$ at ${ 0 ,}$ we have for ${ h }$ in a neighbourhood of ${ 0 }$

\[{ \begin{align*} &f(p+h) - f(p) \geq \lVert h \rVert ^2 \left( \frac{\alpha}{2} + \varphi(h) \right) \\ &\text{and } \vert \varphi (h) \vert < \frac{\alpha}{4}. \end{align*} }\]

For ${ h \neq 0 }$ in this neighbourhood

\[{ \begin{align*} f(p+h) - f(p) \geq &\, \lVert h \rVert ^2 \left( \frac{\alpha}{2} + \varphi (h) \right) \\ > &\, \lVert h \rVert ^2 \left( \frac{\alpha}{2} - \frac{\alpha}{4} \right) \\ > &\, 0 \end{align*} }\]

as needed. ${ \blacksquare }$
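
The two theorems above can be sanity-checked numerically. The sketch below (the function ${ f(x, y) = x ^2 + xy + 2y ^2 }$, the point ${ p = (0, 0) }$, and the step sizes are illustrative choices, not from the text) approximates ${ Df(p) }$ and ${ Hf(p) }$ by finite differences and confirms ${ Df(p) = 0 }$ and ${ Hf(p) > 0 }$ via Sylvester's criterion (positive leading principal minors), so ${ p }$ should be a strict local minimum.

```python
# Finite-difference check of the second-order conditions (illustrative sketch).
def f(x, y):
    return x**2 + x*y + 2*y**2

def grad(f, p, h=1e-6):
    """Central-difference approximation of Df(p)."""
    x, y = p
    return ((f(x+h, y) - f(x-h, y)) / (2*h),
            (f(x, y+h) - f(x, y-h)) / (2*h))

def hessian(f, p, h=1e-4):
    """Central-difference approximation of Hf(p)."""
    x, y = p
    fxx = (f(x+h, y) - 2*f(x, y) + f(x-h, y)) / h**2
    fyy = (f(x, y+h) - 2*f(x, y) + f(x, y-h)) / h**2
    fxy = (f(x+h, y+h) - f(x+h, y-h) - f(x-h, y+h) + f(x-h, y-h)) / (4*h**2)
    return [[fxx, fxy], [fxy, fyy]]

p = (0.0, 0.0)
g = grad(f, p)
H = hessian(f, p)
# Df(p) = 0 and Hf(p) > 0 (Sylvester's criterion for 2x2), so p is a
# strict local minimum by the sufficient condition above.
assert abs(g[0]) < 1e-8 and abs(g[1]) < 1e-8
assert H[0][0] > 0 and H[0][0]*H[1][1] - H[0][1]*H[1][0] > 0
```

Here the exact Hessian is ${ \begin{pmatrix} 2 & 1 \\ 1 & 4 \end{pmatrix} , }$ with leading minors ${ 2 }$ and ${ 7 . }$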

Back to top.
\[{ \underline{\textbf{Inverse function theorem}} }\]

Thm [Contraction maps have a unique fixed point]:
Let ${ (X, d) }$ be a complete metric space and ${ T : X \to X .}$ Suppose there is a ${ K \in (0, 1) }$ such that

\[{ d(T(x), T(y)) \leq K d(x, y) \quad \text{ for all } x, y \in X . }\]

Then ${ T }$ has a unique fixed point (i.e. there is a unique ${ p \in X }$ such that ${ T(p) = p }$). Also, for any ${ x \in X }$ the sequence of iterates ${ (T ^n (x)) }$ converges to this fixed point.

Pf: Firstly, if at all a fixed point exists it is unique: Say ${ p _1, p _2 \in X }$ are two fixed points. Now

\[{ \underbrace{d(T(p _1), T(p _2))} _{= d(p _1, p _2)} \leq \underbrace{K} _{\text{in } (0, 1)} d(p _1, p _2) , }\]

so ${ d(p _1, p _2) = 0 }$ that is ${ p _1 = p _2 . }$

Let ${ x \in X . }$ Note the sequence of iterates ${ (T ^n (x)) }$ is Cauchy: For all integers ${ m > n > 0 , }$

\[{ \begin{align*} &d(T ^{m} (x), T ^{n} (x)) \\ \leq &\, K ^n d(T ^{m-n} (x), x) \\ \leq &\, K ^n \left[ d(T ^{m-n} (x), T ^{m-n-1} (x)) + \ldots + d(T(x), x) \right] \\ \leq &\, K ^n (K ^{m-n-1} + \ldots + 1) \, d(T(x), x) \\ = &\, K ^n \frac{1-K ^{m-n} }{1-K} d(T(x), x) \\ = &\, \frac{K ^n - K ^m}{1-K} d(T(x), x) . \end{align*} }\]

By completeness, there is a ${ p \in X }$ such that

\[{ p = \lim _{n \to \infty} T ^n (x) . }\]

Since ${ T }$ is continuous,

\[{ \begin{align*} T(p) = &\, T\, \left( \lim _{n \to \infty} T ^n (x) \right) \\ = &\, \lim _{n \to \infty} T(T ^n (x)) \\ = &\, p \end{align*} }\]

as needed. ${ \blacksquare }$
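
A minimal numerical illustration (the map and starting point are illustrative choices): ${ T = \cos }$ maps ${ [0, 1] }$ into itself and is a contraction there, since ${ \vert \cos ^{’} (x) \vert = \vert \sin x \vert \leq \sin 1 < 1 , }$ so the iterates converge to its unique fixed point.

```python
import math

def fixed_point(T, x, tol=1e-12, max_iter=10_000):
    """Iterate x, T(x), T(T(x)), ... until successive iterates are close."""
    for _ in range(max_iter):
        x_next = T(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("did not converge")

# cos is a contraction on the complete metric space ([0, 1], |.|).
p = fixed_point(math.cos, 0.5)
assert abs(math.cos(p) - p) < 1e-11   # p is (numerically) the fixed point
```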

Basic recall: Invertibility/bijectivity of set maps.

Def [${ C ^p }$ isomorphisms]:
Consider complete normed spaces ${ E, F, }$ and a ${ C ^p }$ map ${ f : U (\subseteq E \text{ open}) \to F . }$
We say ${ f }$ is a ${ C ^p }$ isomorphism if ${ V := f(U) }$ is open and there is a ${ C ^p }$ map ${ g : V \to U }$ with ${ g \circ f = \text{id} _{U} }$ and ${ f \circ g = \text{id} _{V} . }$
(That is, if ${ f }$ maps ${ U }$ bijectively to an open set ${ f(U) , }$ and the inverse ${ f ^{-1} : f(U) \to U }$ is also ${ C ^p }$).
Let ${ x \in U . }$ We say ${ f }$ is a local ${ C ^p }$ isomorphism at ${ x }$ if there is an open neighbourhood ${ U _1 \subseteq U }$ of ${ x }$ such that ${ f \vert _{U _1} }$ is a ${ C ^p }$ isomorphism.

Obs [${ C ^p }$ isomorphisms]:
Consider complete normed spaces ${ E, F, }$ and a ${ C ^p }$ map

\[{ f : U (\subseteq E \text{ open}) \longrightarrow V (\subseteq F \text{ open}). }\]

Further consider complete normed spaces ${ E _1, F _1, }$ and ${ C ^p }$ isomorphisms

\[{ \lambda : U _1 (\subseteq E _1 \text{ open}) \longrightarrow U (\subseteq E \text{ open}), }\] \[{ \mu : V _1 (\subseteq F _1 \text{ open}) \longrightarrow V (\subseteq F \text{ open}). }\]

In this setup, since composition of ${ C ^p }$ isomorphisms is a ${ C ^p }$ isomorphism,

\[{ \begin{align*} &f : U \to V \text{ is a } C ^p \text{ isomorphism} \\ \iff &\, \mu ^{-1} \circ f \circ \lambda : U _1 \to V _1 \text{ is a } C ^p \text{ isomorphism}. \end{align*} }\]

So when trying to show a map is a ${ C ^p }$ isomorphism, we can consider the problem up to composing with known ${ C ^p }$ isomorphisms.

Def [Toplinear isomorphisms]:
Let ${ E, F }$ be normed spaces, and ${ f : E \to F .}$
We say ${ f }$ is a toplinear isomorphism if it is an isomorphism of vector spaces and an isomorphism of topological spaces.
That is, ${ f }$ is a toplinear isomorphism if ${ f : E \to F }$ is continuous linear and there is a ${ g : F \to E }$ continuous linear with ${ g \circ f = \text{id} _E }$ and ${ f \circ g = \text{id} _F . }$
The set of toplinear isomorphisms ${ E \to F }$ is written

\[{ L _{\text{is}} (E, F) := \lbrace \text{toplinear isomorphisms } E \to F \rbrace . }\]

When ${ E, F }$ are complete, ${ L _{\text{is}} (E, F) }$ is an open subset of ${ L(E, F) . }$

Obs [${ \sum _{k=0} ^{\infty} A ^k }$ as a toplinear isomorphism]:
Let ${ E }$ be a complete normed space and ${ A \in L(E, E) . }$ Now

\[{ \lVert A \rVert < 1 \implies I-A \in L _{\text{is}} (E, E) . }\]

For ${ \lVert A \rVert < 1 }$ the inverse ${ (I-A) ^{-1} }$ ${ = \textstyle \sum _{k=0} ^{\infty} A ^k . }$

Especially ${ \lVert (I - A) ^{-1} \rVert = \lVert \sum _{k=0} ^{\infty} A ^k \rVert }$ ${ \leq \sum _{k = 0} ^{\infty} \lVert A \rVert ^k }$ ${ = (1 - \lVert A \rVert) ^{-1} . }$

Pf: Say ${ \lVert A \rVert < 1 . }$ The sequence of partial sums ${ \sum _{k=0} ^n A ^k }$ is Cauchy, because

\[{ \lVert \textstyle \sum _{k=n} ^{m} A ^k \rVert \leq \textstyle \sum _{k=n} ^{m} \lVert A \rVert ^k \quad (\text{for } m > n) }\]

and the sequence ${ \sum _{k=0} ^{n} \lVert A \rVert ^k }$ is Cauchy.
By completeness, ${ \sum _{k=0} ^{\infty} A ^k }$ converges in ${ L(E, E) . }$

Finally ${ A (\sum _{k = 0} ^{\infty} A ^k) = (\sum _{k = 0} ^{\infty} A ^k) A = (\sum _{k = 0} ^{\infty} A ^k) - I }$ that is

\[{ (I-A) (\textstyle \sum _{k = 0} ^{\infty} A ^k) = (\sum _{k = 0} ^{\infty} A ^k) (I-A) = I , }\]

as needed. ${ \blacksquare }$
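
A numerical sketch of the Neumann series for ${ 2 \times 2 }$ matrices (the matrix ${ A }$ is an illustrative choice). With the sup norm on ${ \mathbb{R} ^2 , }$ the induced operator norm is the maximum absolute row sum, so both the hypothesis ${ \lVert A \rVert < 1 }$ and the bound ${ \lVert (I-A) ^{-1} \rVert \leq (1 - \lVert A \rVert) ^{-1} }$ can be checked exactly.

```python
# Partial sums of the Neumann series sum A^k should invert I - A.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def op_norm(A):
    # Operator norm induced by the sup norm: max absolute row sum.
    return max(sum(abs(a) for a in row) for row in A)

I = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.2, 0.3], [0.1, -0.25]]
assert op_norm(A) < 1

# S_n = I + A + ... + A^n.
S, P = I, I
for _ in range(200):
    P = matmul(P, A)
    S = [[S[i][j] + P[i][j] for j in range(2)] for i in range(2)]

# (I - A) S = I - A^{201} should be (numerically) the identity ...
IA = [[I[i][j] - A[i][j] for j in range(2)] for i in range(2)]
check = matmul(IA, S)
assert all(abs(check[i][j] - I[i][j]) < 1e-12 for i in range(2) for j in range(2))
# ... and the norm bound ||(I-A)^{-1}|| <= (1 - ||A||)^{-1} holds.
assert op_norm(S) <= 1 / (1 - op_norm(A)) + 1e-12
```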

Obs [${ L _ {\text{is}} (E, F) }$ is open in ${ L(E, F) }$]:
Let ${ E, F }$ be complete normed spaces. Now ${ L _{\text{is}} (E, F) }$ is an open subset of ${ L(E, F) . }$

For ${ E = F = \mathbb{R} ^n }$ it is clear: The set of toplinear isomorphisms is ${ L _{\text{is}} (\mathbb{R} ^n , \mathbb{R} ^n) = GL _n (\mathbb{R}), }$ which is the preimage of the open set ${ \mathbb{R} \setminus \lbrace 0 \rbrace }$ under the continuous determinant map ${ \text{det} : M _n (\mathbb{R}) \to \mathbb{R} . }$

Pf: Let ${ A _0 \in L _{\text{is}} (E, F) . }$ We want a ${ \delta > 0 }$ such that

\[{ \text{Want: } \quad \begin{aligned} &\, A \in L(E, F), \, \lVert A - A _0 \rVert < \delta \\ \implies &\, A \in L _{\text{is}} (E, F) \end{aligned} }\]

Note

\[{ \begin{align*} A = &\, A _0 + A - A _0 \\ = &\, A _0 [ \, \text{id} _{E} + A _0 ^{-1} (A - A _0) \, ] \\ \end{align*} }\]

with ${ \text{id} _E + A _0 ^{-1} (A - A _0) }$ in ${ L (E, E) . }$ It suffices to find a ${ \delta > 0 }$ such that

\[{ \text{Want: } \quad \begin{aligned} &\, A \in L(E, F), \, \lVert A - A _0 \rVert < \delta \\ \implies &\, \text{id} _E + A _0 ^{-1} (A - A _0) \in L _{\text{is}} (E, E) \end{aligned} }\]

Setting ${ \delta := \frac{1}{ 2 \lVert A _0 ^{-1} \rVert} }$ works, because if ${ \lVert A - A _0 \rVert < \delta }$

\[{ \lVert - A _0 ^{-1} (A - A _0) \rVert \leq \lVert A _0 ^{-1} \rVert \lVert A - A _0 \rVert < \frac{1}{2} }\]

ensures ${ \text{id} _E - (- A _0 ^{-1} (A - A _0)) }$ is a toplinear isomorphism. ${ \blacksquare }$
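
The radius ${ \delta = \frac{1}{2 \lVert A _0 ^{-1} \rVert} }$ can be probed numerically. In the sketch below (the matrix ${ A _0 }$ and the random perturbations are illustrative choices), every perturbation of norm below ${ \delta }$ stays invertible, as the proof guarantees.

```python
# Probing the delta = 1/(2 ||A0^{-1}||) bound for 2x2 matrices, with the
# operator norm induced by the sup norm (max absolute row sum).
import random

def op_norm(A):
    return max(sum(abs(a) for a in row) for row in A)

def inv2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A0 = [[2.0, 0.0], [1.0, 3.0]]
delta = 1 / (2 * op_norm(inv2(A0)))

# Any A with ||A - A0|| < delta should still be invertible (det != 0).
random.seed(0)
for _ in range(100):
    E = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
    scale = 0.99 * delta / max(op_norm(E), 1e-9)
    A = [[A0[i][j] + scale * E[i][j] for j in range(2)] for i in range(2)]
    assert abs(A[0][0] * A[1][1] - A[0][1] * A[1][0]) > 0
```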

Thm [Derivative of ${ \text{inv} }$]:
Let ${ E, F }$ be complete normed spaces. Then the map

\[{ \text{inv} : L _{\text{is}} (E, F) \longrightarrow L(F, E), \quad A \mapsto A ^{-1} }\]

is infinitely differentiable, with derivative

\[{ D \, \text{inv} : L _{\text{is}} (E, F) \longrightarrow L(L(E, F), L(F, E)) }\]

given by

\[{ (D \, \text{inv})(A) \, H = - A ^{-1} H A ^{-1} }\]

for ${ A \in L _{\text{is}} (E, F) }$ and ${ H \in L(E, F) . }$

For ${ E = F = \mathbb{R} }$ it is clear: ${ (D \, \text{inv}) (a) \, h = - a ^{-2} \, h . }$

Pf: We can first show ${ \text{inv} }$ is continuous.

  • Obs-1: ${ \text{inv} }$ is continuous.

Let ${ A _0 \in L _{\text{is}} (E, F) , }$ and ${ A \in L(E, F) }$ such that ${ \lVert A - A _0 \rVert < \delta = \frac{1}{2\lVert A _0 ^{-1} \rVert} . }$
From previous observation,

\[{ A = A _0 [ \, \text{id} _E + A _0 ^{-1} (A - A _0) ] \quad \text{is in } L _{\text{is}} (E, F) . }\]

Letting ${ \psi (A) := \text{id} _E + A _0 ^{-1} (A - A _0) , }$ we have

\[{ \begin{align*} &\, \text{inv}(A) - \text{inv}(A _0) \\ = &\, \psi(A) ^{-1} A _0 ^{-1} - A _0 ^{-1} \\ = &\, [\psi (A) ^{-1} - \text{id} _E \, ] A _0 ^{-1} \\ = &\, \psi (A) ^{-1} [\, \text{id} _E - \psi (A) ] A _0 ^{-1} \\ = &\, - \psi (A) ^{-1} A _0 ^{-1} (A - A _0) A _0 ^{-1} . \end{align*} }\]

Note ${ \lVert \psi (A) ^{-1} \rVert }$ ${ \leq (1- \lVert - A _0 ^{-1} (A - A _0) \rVert ) ^{-1}. }$ Hence

\[{ \begin{align*} &\, \lVert \text{inv}(A) - \text{inv}(A _0) \rVert \\ \leq &\, \frac{\lVert A _0 ^{-1} \rVert ^2 \lVert A - A _0 \rVert }{1 - \lVert A _0 ^{-1} (A - A _0) \rVert } \\ \leq &\, \frac{\lVert A _0 ^{-1} \rVert ^2 \lVert A - A _0 \rVert}{1 - 1/2} , \end{align*} }\]

giving continuity of ${ \text{inv} }$ at ${ A _0 . }$

  • Obs-2: ${ \text{inv} }$ is differentiable.

Let ${ A _0 \in L _{\text{is}} (E, F) , }$ and ${ H \in L(E, F) }$ such that ${ \lVert H \rVert < \delta = \frac{1}{2\lVert A _0 ^{-1} \rVert} . }$
Rewriting previous observation,

\[{ A _0 + H \quad \text{is in } L _{\text{is}} (E, F) }\]

and

\[{ \begin{align*} &\, \text{inv}(A _0 + H) - \text{inv}(A _0) \\ = &\, - \psi (A _0 + H) ^{-1} A _0 ^{-1} H A _0 ^{-1} . \end{align*} }\]

Therefore

\[{ \begin{align*} &\, \text{inv}(A _0 + H) - \text{inv}(A _0) + A _0 ^{-1} H A _0 ^{-1} \\ = &\, [ \, \text{id} _E - \psi (A _0 + H) ^{-1} ] A _0 ^{-1} H A _0 ^{-1} \\ = &\, \underbrace{[\psi(A _0 + H) - \text{id} _E \, ]} _{= \, A _0 ^{-1} H } \, \underbrace{\psi (A _0 + H) ^{-1} A _0 ^{-1} H A _0 ^{-1}} _{= \, \text{inv}(A _0) - \text{inv}(A _0 + H) } . \end{align*} }\]

Now

\[{ H \mapsto - A _0 ^{-1} H A _0 ^{-1} }\]

gives a continuous linear map ${ L(E, F) \to L(F, E), }$ and the error

\[{ \varepsilon (H) := A _0 ^{-1} H \, [\text{inv}(A _0) - \text{inv}(A _0 + H) ] }\]

satisfies ${ \frac{\varepsilon(H)}{\lVert H \rVert} \to 0 }$ as ${ H \to 0 . }$

Hence ${ \text{inv} }$ is differentiable at ${ A _0, }$ with

\[{ (D \, \text{inv}) (A _0) \, H = - A _0 ^{-1} H A _0 ^{-1} . }\]

  • Obs-3: ${ \text{inv} }$ is infinitely differentiable.

Consider the continuous bilinear maps

\[{ g : L(F, E) ^2 \longrightarrow L(L(E, F), L(F, E)) }\] \[{ g(T _1, T _2) \, S := - T _1 S T _2 }\]

and

\[{ h : L(F, E) \longrightarrow L(F, E) ^2 }\] \[{ h(T) := (T, T). }\]

From previous observation,

\[{ D \, \text{inv} = g \, \circ \, h \, \circ \, \text{inv} . }\]

It is of the form

\[{ D \, \text{inv} = (\text{A } C ^{\infty} \text{ map}) \, \circ \, (\text{A } C ^{\infty} \text{ map}) \, \circ \, \text{inv} }\]

where ${ C ^{\infty} }$ means infinite differentiability.
So for ${ k \geq 0 ,}$ we have

\[{ \text{inv} \text{ is } C ^k \implies D \, \text{inv} \text{ is } C ^k }\]

that is

\[{ \text{inv} \text{ is } C ^k \implies \text{inv} \text{ is } C ^{k+1} . }\]

Since ${ \text{inv} }$ is ${ C ^0, }$ by above induction it is ${ C ^{\infty} , }$ as needed. ${ \blacksquare }$
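
The formula ${ (D \, \text{inv})(A) \, H = - A ^{-1} H A ^{-1} }$ can be checked against a finite-difference quotient. Below is a minimal sketch for ${ 2 \times 2 }$ matrices (the matrices ${ A, H }$ and the step ${ t }$ are illustrative choices).

```python
# Finite-difference check of (D inv)(A) H = -A^{-1} H A^{-1}.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2.0, 1.0], [1.0, 3.0]]
H = [[0.3, -0.1], [0.2, 0.4]]
t = 1e-6

# Difference quotient (inv(A + tH) - inv(A)) / t ...
AtH = [[A[i][j] + t * H[i][j] for j in range(2)] for i in range(2)]
diff = [[(inv2(AtH)[i][j] - inv2(A)[i][j]) / t for j in range(2)]
        for i in range(2)]
# ... should agree with -A^{-1} H A^{-1} up to O(t).
Ainv = inv2(A)
pred = [[-x for x in row] for row in matmul(matmul(Ainv, H), Ainv)]
assert all(abs(diff[i][j] - pred[i][j]) < 1e-5 for i in range(2) for j in range(2))
```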

With these observations, one can study local invertibility of functions.

Thm [Inverse function theorem]:
Consider complete normed spaces ${ E, F, }$ and a ${ C ^p }$ map ${ f : U (\subseteq E \text{ open}) \to F . }$ Let ${ x _0 \in U }$ be such that ${ Df(x _0) : E \to F }$ is a toplinear isomorphism.
Then ${ f }$ is a local ${ C ^p }$ isomorphism at ${ x _0 . }$

\[{ \boxed{\begin{aligned} &\, \textbf{Heuristic: } \text{If } f \text{ is a } C ^p \text{ map, then at points where} \\ &\, \text{the derivative is nonsingular (in the toplinear sense), } \\ &\, f \text{ is a local } C ^p \text{ isomorphism}. \end{aligned} } }\]

Pf: We see ${ Df(x _0) ^{-1} f : U \to E }$ is a ${ C ^p }$ map with derivative at ${ x _0 }$ being ${ \text{id} _E . }$ It suffices to show this map ${ f _1 := Df(x _0) ^{-1} f }$ is a local ${ C ^p }$ isomorphism at ${ x _0 . }$

Let ${ r > 0 }$ be such that ${ B(x _0, r) \subseteq U . }$ It suffices to show ${ f _2 : B(0, r) \to E, }$ ${ f _2 (x) := f _1 (x + x _0) }$ is a local ${ C ^p }$ isomorphism at ${ 0 . }$

It suffices to show the translation ${ \hat{f} : B(0, r) \to E, }$ ${ \hat{f}(x) := f _2 (x) - f _2 (0) }$ is a local ${ C ^p }$ isomorphism at ${ 0 . }$

Here

\[{ \hat{f} : B(0, r) \longrightarrow E, }\] \[{ \hat{f}(x) = Df(x _0) ^{-1} [ f(x + x _0) - f(x _0) ] . }\]

It is a ${ C ^p }$ map, ${ D \hat{f}(0) = \text{id} _E }$ and ${ \hat{f}(0) = 0 . }$ We are to show

\[{ \text{To show: } \quad \hat{f} \text{ is a local } C ^p \text{ isomorphism at } 0 }\]

that is

\[{ \text{To show: } \quad \begin{aligned} &\text{There are open neighbourhoods } U, V \\ &\, \text{of } 0 \text{ such that } \hat{f} \vert _U : U \to V \text{ is a bijection} \\ &\, \text{and } (\hat{f} \vert _U) ^{-1} \text{ is a } C ^p \text{ map.} \end{aligned} }\]

We can proceed by first showing a local bijection.

  • Obs-1: There exist ${ R _1, R _2 > 0 }$ (less than ${ r }$) such that ${ \hat{f} }$ gives a bijection ${ B[0, R _1] \to B[0, R _2] . }$

We want ${ R _1, R _2 > 0 }$ (less than ${ r }$) such that for every ${ y \in B[0, R _2 ] , }$ the map

\[{ g _y : B(0, r) \longrightarrow E, \quad g _y (x) := x - \hat{f}(x) + y }\]

when viewed as a map ${ B[0, R _1] \to E }$ has a unique fixed point. (Note ${ g _y (x) = x }$ if and only if ${ \hat{f}(x) = y . }$)

The maps ${ \hat{f}, g _y }$ (where ${ y \in E }$) are by default defined on ${ B(0, r) . }$

Note ${ g _y (x) = g _0 (x) + y . }$
As ${ Dg _0 (x) = I - D\hat{f}(x) }$ is continuous and ${ Dg _0 (0) = 0, }$ there is an ${ R > 0 }$ (less than ${ r }$) such that

\[{ \lVert x \rVert \leq R \implies \lVert Dg _0 (x) \rVert \leq \frac{1}{2}. }\]

Now for ${ \lVert x \rVert \leq R }$ the term ${ g _0 (x) }$ is bounded as

\[{ \require{cancel} \begin{align*} \lVert g _0 (x) \rVert = &\, \left\lVert \cancel{g _0 (0)} + \int _0 ^1 Dg _0 (tx) \, x \, dt \right\rVert \\ \leq &\, \frac{1}{2} \lVert x \rVert \end{align*} }\]

that is

\[{ \lVert x \rVert \leq R \implies \lVert Dg _0 (x) \rVert \leq \frac{1}{2}, \, \lVert g _0 (x) \rVert \leq \frac{1}{2} \lVert x \rVert . }\]

Hence for ${ \lVert x \rVert \leq R }$ and ${ \lVert y \rVert \leq R/2 , }$

\[{ \begin{align*} \lVert g _y (x) \rVert \leq &\, \lVert g _0 (x) \rVert + \lVert y \rVert \\ \leq &\, \frac{R}{2} + \frac{R}{2} = R . \end{align*} }\]

For every ${ y \in B[0, R/2] , }$ ${ g _y }$ gives a map ${ B[0, R] \to B[0, R]. }$

In fact for every ${ y \in B[0, R/2] , }$ ${ g _y }$ gives a contraction map ${ B[0, R] \to B[0, R]. }$

Let ${ y \in B[0, R/2]. }$ For ${ x _1, x _2 \in B[0, R], }$

\[{ \begin{align*} &\, \lVert g _y (x _2) - g _y (x _1) \rVert \\ = &\, \lVert g _0 (x _2) - g _0 (x _1) \rVert \\ \leq &\, \sup _{x \in [[x _1, x _2]]} \lVert Dg _0 (x) \rVert \, \, \lVert x _1 - x _2 \rVert \\ \leq &\, \frac{1}{2} \lVert x _1 - x _2 \rVert . \end{align*} }\]

So for every ${ y \in B[0, R/2] , }$ there is a unique ${ x \in B[0, R] }$ with ${ x - \hat{f}(x) + y = x , }$ that is, a unique ${ x \in B[0, R] }$ with ${ \hat{f}(x) = y . }$

We observed

\[{ \text{Observed: } \quad \begin{aligned} &\, \text{There is an } R > 0 \text{ such that } \hat{f} \\ &\, \text{ gives a bijection } B[0, R] \to B[0, R/2] \end{aligned} }\]

and Obs-1 is true.

Further, note for any ${ 0 < R ^{’} \leq R, }$ the argument which gave the bijection

\[{ B[0, R] \to B[0, R/2], \quad x \mapsto \hat{f}(x) }\]

gives the bijection

\[{ B[0, R ^{’}] \to B[0, R ^{’} /2], \quad x \mapsto \hat{f}(x) . }\]

For the main goal, we want open neighbourhoods ${ U, V }$ of ${ 0 }$ such that ${ \hat{f} }$ maps bijectively ${ U \to V . }$

Fix any ${ 0 < R ^{’} < R . }$

The above bijection ${ B[0, R ^{’}] \to B[0, R ^{’} / 2] }$ gives a bijection

\[{ \lbrace x \in B[0, R ^{’}] : \hat{f}(x) \in B(0, R ^{’} /2) \rbrace \to B(0, R ^{’} / 2) }\]

that is

\[{ B[0, R ^{’}] \cap \hat{f} ^{-1} (B(0, R ^{’} /2)) \to B(0, R ^{’} /2) . }\]

The left hand side is an open neighbourhood of ${ 0 }$:

  • Obs-2: In the bijection ${ B[0, R ^{’}] \to B[0, R ^{’} /2], }$ we have ${ \lVert x \rVert = R ^{’} \implies \lVert \hat{f}(x) \rVert = R ^{’} /2 . }$
    Especially
\[{ \begin{align*} &\, B[0, R ^{’} ] \cap \hat{f} ^{-1} (B(0, R ^{’} /2)) \\ = &\, B(0, R ^{’}) \cap \hat{f} ^{-1} (B(0, R ^{’} /2)) . \end{align*} }\]

Suppose not: say there is a point ${ x }$ with ${ \lVert x \rVert = R ^{’} }$ and ${ \lVert \hat{f}(x) \rVert < R ^{’} / 2 . }$ (Note ${ \lVert \hat{f}(x) \rVert \leq R ^{’} / 2 }$ always, as ${ \hat{f} }$ maps ${ B[0, R ^{’}] }$ into ${ B[0, R ^{’} /2] . }$)
As ${ \hat{f} }$ is continuous, there is a ${ \delta > 0 }$ such that ${ B(x, \delta) \subseteq B(0, R) }$ and

\[{ t \in B(x, \delta) \implies \lVert \hat{f}(t) \rVert < R ^{’} / 2 . }\]

Pick a point ${ x ^{\ast} \in B(x, \delta) }$ of norm ${ \lVert x ^{\ast} \rVert > \lVert x \rVert . }$ (For example, ${ x + \frac{x}{\lVert x \rVert} \frac{\delta}{2} .}$ ) Now

\[{ R ^{’} = \lVert x \rVert < \lVert x ^{\ast} \rVert < R \quad \text{ and } \quad \lVert \hat{f}(x ^{\ast}) \rVert, \lVert \hat{f}(x) \rVert < R ^{’} / 2 . }\]

The ${ B[0, R] \to B[0, R/2] }$ preimage of ${ \hat{f}(x ^{\ast}) }$ is ${ x ^{\ast} . }$ But the bijection ${ B[0, R ^{’} ] \to B[0, R ^{’} / 2] }$ says this preimage must lie in ${ B[0, R ^{’}] , }$ a contradiction.
Hence Obs-2 is true.

Finally

\[{ U := B(0, R ^{’}) \cap \hat{f} ^{-1} (B(0, R ^{’} /2)), }\] \[{ V := B(0, R ^{’} / 2) }\]

are open neighbourhoods of ${ 0 }$ such that

\[{ U \to V, \quad x \mapsto \hat{f}(x) }\]

is a bijection.

  • Obs-3: The map ${ V \to U, y \mapsto \hat{f} ^{-1} (y) }$ is continuous.

It suffices to show the inverse of the bijection ${ B[0, R] \to B[0, R/2] }$ is continuous.

Any ${ x \in B[0, R] }$ can be written as

\[{ \begin{align*} x = &\, g _0 (x) + (x - g _0 (x)) \\ = &\, g _0 (x) + \hat{f}(x) . \end{align*} }\]

So for all ${ x _1, x _2 \in B[0, R] , }$ we see

\[{ \begin{align*} &\, \lVert x _1 - x _2 \rVert \\ \leq &\, \lVert g _0 (x _1) - g _0 (x _2) \rVert + \lVert \hat{f}(x _1) - \hat{f}(x _2) \rVert \\ \leq &\, \frac{1}{2} \lVert x _1 - x _2 \rVert + \lVert \hat{f}(x _1) - \hat{f}(x _2) \rVert \end{align*} }\]

that is

\[{ \frac{1}{2} \lVert x _1 - x _2 \rVert \leq \lVert \hat{f} (x _1) - \hat{f} (x _2) \rVert . }\]

Rewriting this, for all ${ y _1, y _2 \in B[0, R/2], }$

\[{ \frac{1}{2} \lVert \hat{f} ^{-1} (y _1) - \hat{f} ^{-1} (y _2) \rVert \leq \lVert y _1 - y _2 \rVert . }\]

Hence Obs-3 is true.

  • Obs-4: The map ${ \mathfrak{F} : V \to U, y \mapsto \hat{f} ^{-1} (y) }$ is differentiable.

Firstly, for every ${ x \in U }$ the derivative ${ D \hat{f}(x) }$ is a toplinear isomorphism (${ \lVert Dg _0 (x) \rVert }$ ${ = \lVert I - D\hat{f}(x) \rVert }$ ${ \leq \frac{1}{2}}$ so ${ I - (I - D\hat{f}(x)) }$ is invertible).

Let ${ y _0 \in V }$ and ${ x _0 := \hat{f} ^{-1} (y _0) . }$ We will show

\[{ \text{Will show: } \quad (D\mathfrak{F})(y _0) = (D \hat{f} )(x _0) ^{-1} . }\]

Let ${ y \in V }$ be a variable point, and set ${ x := \hat{f} ^{-1} (y) . }$ We have (for ${ x }$ in a neighbourhood of ${ x _0 }$)

\[{ \begin{align*} &\, \hat{f}(x) = \hat{f}(x _0) + D\hat{f}(x _0) \, (x - x _0) + \lVert x - x _0 \rVert \varphi (x - x _0) \\ &\, \text{with } \varphi(0) = 0 \text{ and } \varphi \text{ continuous at } 0. \end{align*} }\]

Rewriting this,

\[{ x = x _0 + D \hat{f} (x _0) ^{-1} \, [\hat{f}(x) - \hat{f} (x _0) - \lVert x - x _0 \rVert \varphi (x - x _0) ] }\]

that is

\[{ \begin{align*} \hat{f} ^{-1} (y) = &\, \hat{f} ^{-1} (y _0) + D \hat{f} (x _0) ^{-1} \, (y - y _0) \\ &\, - \lVert x - x _0 \rVert D \hat{f}(x _0) ^{-1} \varphi (x - x _0) . \end{align*} }\]

Note

\[{ \frac{\lVert x - x _0 \rVert D \hat{f}(x _0) ^{-1} \varphi (x - x _0)}{\lVert y - y _0 \rVert} \to 0 \text{ as } y \to y _0 }\]

because ${ \lVert x - x _0 \rVert \leq 2 \lVert y - y _0 \rVert }$ from the previous observation, and hence ${ \varphi (x - x _0) \to 0 }$ as ${ y \to y _0 . }$

Finally

\[{ (D\mathfrak{F}) (y _0) = D\hat{f}(x _0) ^{-1} }\]

and Obs-4 is true.

  • Obs-5: The map ${ \mathfrak{F} : V \to U, y \mapsto \hat{f} ^{-1} (y) }$ is ${ C ^p . }$

From previous observation,

\[{ D \mathfrak{F} = \text{inv} \, \circ \, D \hat{f} \, \circ \, \mathfrak{F} . }\]

It is of the form

\[{ D \mathfrak{F} = (\text{A } C ^{\infty} \text{ map}) \, \circ \, (\text{A } C ^{p-1} \text{ map}) \, \circ \, \mathfrak{F} . }\]

So for ${ 0 \leq k \leq p-1 , }$ we have

\[{ \mathfrak{F} \text{ is } C ^k \implies D \mathfrak{F} \text{ is } C ^k }\]

that is

\[{ \mathfrak{F} \text{ is } C ^k \implies \mathfrak{F} \text{ is } C ^{k+1} . }\]

Since ${ \mathfrak{F} }$ is ${ C ^0 , }$ by above induction it is ${ C ^p , }$ as needed. ${ \blacksquare }$
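
The proof's construction is effective: the local inverse is obtained by iterating the contraction ${ g _y (x) = x - \hat{f}(x) + y . }$ A minimal one-dimensional sketch (the map ${ \hat{f}(x) = x + x ^2 / 2 , }$ with ${ \hat{f}(0) = 0 }$ and ${ D\hat{f}(0) = 1 , }$ is an illustrative choice):

```python
# Computing the local inverse of fhat(x) = x + x^2/2 near 0 by iterating
# the contraction g_y(x) = x - fhat(x) + y, as in Obs-1 of the proof.
def fhat(x):
    return x + x * x / 2

def local_inverse(y, iters=200):
    # Here Dg_0(x) = -x, so g_y is a contraction on B[0, 1/2] (R = 1/2),
    # and the iteration works for |y| <= R/2 = 1/4.
    x = 0.0
    for _ in range(iters):
        x = x - fhat(x) + y   # one step of g_y
    return x

y = 0.2
x = local_inverse(y)
assert abs(fhat(x) - y) < 1e-12          # x solves fhat(x) = y
assert abs(x - (-1 + 1.4 ** 0.5)) < 1e-9  # matches the exact root of x + x^2/2 = 0.2
```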

Back to top.
\[{ \underline{\textbf{Implicit function theorem}} }\]

Consider complete normed spaces ${ E, F, G , }$ and a ${ C ^p }$ map

\[{ f : U (\subseteq E \times F \text{ open}) \longrightarrow G . }\]

It has a zero set

\[{ f ^{-1} (0) = Z _f = \lbrace (x, y) : (x, y) \in U, f(x, y) = 0 \rbrace . }\]

Let ${ (a, b) \in Z _f . }$ The goal is to study the structure of ${ Z _f }$ near ${ (a, b) .}$
We can ask ourselves: When does ${ Z _f }$ look like the graph of a function near ${ (a, b) }$?

Obs [Open subsets of ${ E \times F }$]:
Let ${ E, F }$ be complete normed spaces. Unless otherwise mentioned, ${ E \times F }$ is equipped with the sup norm. Now for ${ (a, b) \in E \times F }$ and ${ \delta > 0 , }$

\[{ \begin{align*} &\, B((a, b), \delta) \\ = &\, \lbrace (x, y): \max \lbrace \lVert x - a \rVert, \lVert y - b \rVert \rbrace < \delta \rbrace \\ = &\, \lbrace (x, y) : \lVert x - a \rVert < \delta , \lVert y - b \rVert < \delta \rbrace \\ = &\, B(a, \delta) \times B(b, \delta) . \end{align*} }\]

Unions of such open balls make up the open subsets of ${ E \times F . }$

Obs: Let ${ E, F }$ be complete normed spaces, and ${ U, V }$ be open subsets of ${ E, F }$ respectively. Then ${ U \times V }$ is an open subset of ${ E \times F . }$
Pf: Let ${ (a, b) \in U \times V . }$ There are ${ \delta _1, \delta _2 > 0 }$ such that ${ B(a, \delta _1) \subseteq U }$ and ${ B(b, \delta _2) \subseteq V . }$ Considering ${ \delta = \min \lbrace \delta _1, \delta _2 \rbrace > 0 , }$ the ball

\[{ \begin{align*} B((a, b), \delta) = &\, B(a, \delta) \times B(b, \delta) \\ \subseteq &\, B(a, \delta _1) \times B(b, \delta _2) \\ \subseteq &\, U \times V \end{align*} }\]

as needed. ${ \blacksquare }$

Def [Partial derivatives]:
Consider complete normed spaces ${ E, F, G, }$ open subsets ${ U \subseteq E, }$ ${ V \subseteq F , }$ and a map

\[{ f : U \times V \, (\subseteq E \times F \text{ open}) \longrightarrow G . }\]

Let ${ (a, b) \in U \times V . }$ If the section

\[{ f ^{b} = f(\cdot, b) : U \longrightarrow G, \quad x \mapsto f(x, b) }\]

is differentiable at ${ a , }$ the derivative ${ (Df ^{b} )(a) \in L(E, G) }$ is written as ${ \partial _1 f(a, b) . }$
Similarly if the section

\[{ f ^{a} = f(a, \cdot) : V \longrightarrow G, \quad y \mapsto f(a, y) }\]

is differentiable at ${ b, }$ the derivative ${ (D f ^{a}) (b) \in L(F, G) }$ is written as ${ \partial _2 f (a, b) . }$

Thm [${ C ^p }$ maps ${ E \times F \to G }$]:
Consider complete normed spaces ${ E, F, G, }$ open subsets ${ U \subseteq E, }$ ${ V \subseteq F , }$ and a map

\[{ f : U \times V \, (\subseteq E \times F \text{ open}) \longrightarrow G . }\]

Now ${ f }$ is a ${ C ^p }$ map if and only if the partials

\[{ \partial _1 f : U \times V \longrightarrow L(E, G), }\] \[{ \partial _2 f : U \times V \longrightarrow L(F, G) }\]

exist and are ${ C ^{p-1} . }$
In this case, for ${ (a, b) \in U \times V }$ the derivative ${ (Df)(a, b) \in L(E \times F, G) }$ is given by

\[{ (Df)(a, b) \, (h, k) = \partial _1 f (a, b) \, h + \partial _2 f (a, b) \, k . }\]

Pf: ${ \underline{\Rightarrow} }$ Say ${ f }$ is ${ C ^p . }$ We are to show the partials ${ \partial _1 f, }$ ${ \partial _2 f }$ exist and are ${ C ^{p-1} . }$

Let ${ (a, b) \in U \times V . }$ By definition of derivative, for ${ (h, k) }$ in a neighbourhood of ${ (0,0), }$

\[{ \begin{aligned} &\, f(a+h, b+k) = f(a, b) + Df(a, b) \, (h, k) + \lVert (h, k) \rVert \varphi(h, k) \\ &\, \text{with } \varphi(0, 0) = 0 \text{ and } \varphi \text{ continuous at } (0, 0) . \end{aligned} }\]

Especially for ${ h }$ in a neighbourhood of ${ 0 , }$

\[{ \begin{aligned} &\, f(a+h, b) = f(a, b) + Df(a, b) \, (h, 0) + \lVert h \rVert \varphi (h, 0) \end{aligned} }\]

so the partial derivative ${ \partial _1 f(a, b) \in L(E, G) }$ is given by ${ h \mapsto Df(a, b) \, (h, 0). }$

Similarly the partial derivative ${ \partial _2 f(a, b) \in L(F, G) }$ is given by ${ k \mapsto Df (a, b) \, (0, k) . }$

With this, ${ \partial _1 f(a, b) }$ and ${ \partial _2 f(a, b) }$ exist, and

\[{ Df (a, b) \, (h, k) = \partial _1 f (a, b) \, h + \partial _2 f (a, b) \, k . }\]

It is left to show that

\[{ \partial _1 f : \underbrace{(x, y)} _{\in \, U \times V} \mapsto \underbrace{Df(x, y) \, (\cdot \, , \, 0)} _{\in \, L(E, G)}, }\] \[{ \partial _2 f : \underbrace{(x, y)} _{\in \, U \times V} \mapsto \underbrace{Df(x, y) \, (0 \, , \, \cdot)} _{\in \, L(F, G)} }\]

are ${ C ^{p-1} }$ maps. Since

\[{ \alpha _1 : L(E \times F, G) \longrightarrow L(E, G), \quad \alpha _1 (\varphi) = \varphi(\cdot \, , \, 0) }\] \[{ \alpha _2 : L(E \times F, G) \longrightarrow L(F, G), \quad \alpha _2 (\varphi) = \varphi(0 , \, \cdot) }\]

are continuous linear, the compositions

\[{ \partial _1 f = \alpha _1 \, \circ \, Df , }\] \[{ \partial _2 f = \alpha _2 \, \circ Df }\]

are ${ C ^{p-1} }$ maps, as needed.

${ \underline{\Leftarrow} }$ Say the partials

\[{ \partial _1 f : U \times V \longrightarrow L(E, G), }\] \[{ \partial _2 f : U \times V \longrightarrow L(F, G) }\]

exist and are ${ C ^{p-1} . }$ We are to show ${ f }$ is a ${ C ^p }$ map.

Let ${ (a, b) \in U \times V . }$ For ${ (h, k) }$ in a neighbourhood of ${ 0 , }$

\[{ \begin{aligned} &\, f(a+h, b+k) - f(a, b) - \partial _1 f(a, b) \, h - \partial _2 f(a, b) \, k \\ = &\, f(a+h, b+k) - f(a, b+k) \\ &\, + f(a, b+k) - f(a, b) \\ &\, - \partial _1 f(a, b) \, h - \partial _2 f(a, b) \, k \\ = &\, \int _0 ^1 \partial _1 f(a+sh, b+k) \, h \, ds \\ &\, + \int _0 ^1 \partial _2 f(a, b + tk) \, k \, dt \\ &\, - \partial _1 f(a, b) \, h - \partial _2 f(a, b) \, k \\ = &\, \int _0 ^1 [\partial _1 f(a + sh, b+k) - \partial _1 f(a, b)] \, h \, ds \\ &\, + \int _0 ^1 [\partial _2 f(a, b+tk) - \partial _2 f(a, b)] \, k \, dt . \end{aligned} }\]

Calling this error ${ \varepsilon (h, k) , }$ we have

\[{ \begin{aligned} &\, \lVert \varepsilon(h, k) \rVert \\ \leq &\, \max _{0 \leq s \leq 1} \lVert \partial _1 f (a + sh, b + k) - \partial _1 f(a, b) \rVert \lVert h \rVert \\ &\, + \max _{0 \leq t \leq 1} \lVert \partial _2 f (a, b+tk) - \partial _2 f (a, b) \rVert \lVert k \rVert \\ \leq &\, \left( {\begin{aligned} &\, \max _{0 \leq s \leq 1} \lVert \partial _1 f (a + sh, b + k) - \partial _1 f(a, b) \rVert \\ &\, + \max _{0 \leq t \leq 1} \lVert \partial _2 f (a, b+tk) - \partial _2 f (a, b) \rVert \end{aligned}} \right) \max \lbrace \lVert h \rVert , \lVert k \rVert \rbrace . \end{aligned} }\]

Now

\[{ (h, k) \mapsto \partial _1 f(a, b) \, h + \partial _2 f (a, b) \, k }\]

gives a continuous linear map ${ E \times F \to G , }$ and the error ${ \varepsilon(h, k) }$ satisfies ${ \varepsilon(0, 0) = 0 }$ and

\[{ \frac{\lVert \varepsilon(h, k) \rVert}{\max \lbrace \lVert h \rVert, \lVert k \rVert \rbrace} \to 0 \quad \text{ as } \, \, \max \lbrace \lVert h \rVert, \lVert k \rVert \rbrace \to 0 . }\]

Hence ${ f }$ is differentiable at ${ (a, b) , }$ with

\[{ Df(a, b) \, (h, k) = \partial _1 f(a, b) \, h + \partial _2 f(a, b) \, k . }\]

It is left to show that ${ f }$ is a ${ C ^p }$ map, that is ${ (x, y) \mapsto Df(x, y) }$ is ${ C ^{p-1} . }$ The compositions

\[{ \underbrace{(x, y)} _{\in \, U \times V} \mapsto \underbrace{\partial _1 f(x, y)} _{\in \, L(E, G)} \mapsto \underbrace{(\partial _1 f (x, y), 0) } _{\in L(E, G) \times L(F, G)} }\] \[{ \underbrace{(x, y)} _{\in \, U \times V} \mapsto \underbrace{\partial _2 f(x, y)} _{\in \, L(F, G)} \mapsto \underbrace{(0, \partial _2 f (x, y)) } _{\in L(E, G) \times L(F, G)} }\]

are ${ C ^{p-1}, }$ hence their sum

\[{ \underbrace{(x, y)} _{\in \, U \times V} \mapsto \underbrace{(\partial _1 f (x, y), \partial _2 f (x, y))} _{\in \, L(E, G) \times L(F, G)} }\]

is ${ C ^{p-1} . }$ The map

\[{ \alpha : L(E, G) \times L(F, G) \longrightarrow L(E \times F, G), }\] \[{ \alpha (\varphi _1, \varphi _2) \, (h, k) = \varphi _1 (h) + \varphi _2 (k) }\]

is continuous linear, hence the composition

\[{ \underbrace{(x, y)} _{\in \, U \times V} \mapsto \underbrace{(\partial _1 f (x, y), \partial _2 f (x , y))} _{\in \, L(E, G) \times L(F, G) } \mapsto \underbrace{\alpha(\partial _1 f(x, y), \partial _2 f(x, y))} _{\in \, L(E \times F, G)} }\]

is ${ C ^{p-1}, }$ as needed. ${ \blacksquare }$
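
The formula ${ Df(a, b) \, (h, k) = \partial _1 f(a, b) \, h + \partial _2 f(a, b) \, k }$ can be checked numerically. A minimal sketch (the function ${ f(x, y) = xy + \sin y }$ and the point are illustrative choices):

```python
import math

# f(x, y) = x*y + sin(y), with partials d1f = y and d2f = x + cos(y).
def f(x, y):
    return x * y + math.sin(y)

def d1f(x, y):
    return y

def d2f(x, y):
    return x + math.cos(y)

a, b = 0.7, 0.3
h, k = 1e-6, -2e-6
linear = d1f(a, b) * h + d2f(a, b) * k
actual = f(a + h, b + k) - f(a, b)
# The error is o(max(|h|, |k|)); for this smooth f it is second order.
assert abs(actual - linear) < 1e-10
```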

Consider complete normed spaces ${ E, F, G , }$ and a ${ C ^p }$ map

\[{ f : U (\subseteq E \times F \text{ open}) \longrightarrow G . }\]

It has a zero set

\[{ Z _f = \lbrace (x, y) : (x, y) \in U, f(x, y) = 0 \rbrace . }\]

Let ${ (a, b) \in Z _f . }$ The goal is to study the structure of ${ Z _f }$ near ${ (a, b) .}$

Informally, for small ${ \delta > 0 , }$ we have ${ B((a, b), \delta) \subseteq U }$ and

\[{ \require{cancel} \begin{aligned} &\, Z _f \cap B((a, b), \delta) \\ = &\, \lbrace (a+h, b+k) : \lVert h \rVert < \delta, \lVert k \rVert < \delta, f(a+h, b+k) = 0 \rbrace \\ \approx &\, \lbrace (a+h, b+k) : \lVert h \rVert < \delta, \lVert k \rVert < \delta, \cancel{f(a, b)} + \partial _1 f (a, b) \, h + \partial _2 f (a, b) \, k = 0 \rbrace \\ = &\, \lbrace (x, y) : \lVert x - a \rVert < \delta, \lVert y - b \rVert < \delta, \, \partial _1 f (a, b) \, (x - a) + \partial _2 f (a, b) \, (y-b) = 0 \rbrace . \end{aligned} }\]

Now if ${ \partial _2 f(a, b) }$ is invertible, the approximating set looks like the graph of

\[{ E \longrightarrow F , }\] \[{ x \mapsto y = b - [\partial _2 f(a, b) ] ^{-1} \partial _1 f (a, b) (x - a) }\]

within ${ B(a, \delta) \times B(b, \delta) . }$ This suggests the following.

Thm [Implicit function theorem]:
Consider complete normed spaces ${ E, F, G , }$ and a ${ C ^p }$ map

\[{ f : U (\subseteq E \times F \text{ open}) \longrightarrow G . }\]

It has a zero set

\[{ Z _f = \lbrace (x, y) : (x, y) \in U, f(x, y) = 0 \rbrace . }\]

Let ${ (a, b) \in Z _f }$ be such that ${ \partial _2 f(a, b) : F \to G }$ is a toplinear isomorphism.
Then there exist open neighbourhoods

\[{ (a, b) \in \mathscr{U} \subseteq U, \quad a \in \mathscr{V} \subseteq E }\]

and a ${ C ^p }$ map

\[{ g : \mathscr{V} \longrightarrow F }\]

such that

\[{ Z _f \cap \mathscr{U} = (\text{graph of } g) }\]

that is

\[{ \lbrace (x, y) \in \mathscr{U} : f(x, y) = 0 \rbrace = \lbrace (x, g(x)) : x \in \mathscr{V} \rbrace . }\]

Further,

\[{ Dg(x) = - [\partial _2 f \, (x, g(x))] ^{-1} \partial _1 f \, (x, g(x)) }\]

over a neighbourhood of ${ a . }$

\[{ \boxed{\begin{aligned} &\, \textbf{Heuristic:} \text{ If } f : U (\subseteq E \times F) \to G \text{ is a } C ^p \text{ map, } \\ &\, \text{near any point of } Z _f \text{ where } \partial _2 f \text{ is nonsingular, } Z _f \\ &\, \text{looks like the graph of a } C ^p \text{ map } V (\subseteq E) \to F . \end{aligned} } }\]
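As a quick sanity check on the formula for ${ Dg }$ (a sketch with my own choice of ${ f ,}$ not from the text): with ${ E = F = G = \mathbb{R} }$ and ${ f(x, y) = x^2 + y^2 - 1 }$ near ${ (a, b) = (0, 1) ,}$ the implicit function is ${ g(x) = \sqrt{1 - x^2} }$ and the theorem predicts ${ Dg(x) = -(2x)/(2 g(x)) = -x / g(x) ,}$ which a finite-difference quotient confirms.

```python
import math

def g(x):                      # the implicit function near (0, 1)
    return math.sqrt(1.0 - x*x)

def Dg_formula(x):             # -[d2 f(x, g(x))]^{-1} d1 f(x, g(x))
    return -(2.0 * x) / (2.0 * g(x))

def Dg_fd(x, h=1e-6):          # centered finite-difference quotient
    return (g(x + h) - g(x - h)) / (2.0 * h)

x = 0.3
print(Dg_formula(x), Dg_fd(x))  # both approx -0.3145
```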

Pf: Since ${ \partial _2 f (a, b) : F \to G }$ is a toplinear isomorphism, consider the map

\[{ \hat{f} = [\partial _2 f (a, b) ] ^{-1} f : U (\subseteq E \times F ) \longrightarrow F . }\]

It is a ${ C ^p }$ map with ${ Z _{\hat{f}} = Z _f }$ and ${ \partial _2 \hat{f} (a, b) = \text{id} _F . }$

Rewriting the goal, we want open neighbourhoods

\[{ (a, b) \in \mathscr{U} \subseteq U, \quad a \in \mathscr{V} \subseteq E }\]

and a ${ C ^p }$ map

\[{ g : \mathscr{V} \longrightarrow F }\]

such that

\[{ Z _{\hat{f}} \cap \mathscr{U} = (\text{graph of } g). }\]

In an attempt to “complete” the map

\[{ \hat{f} : U (\subseteq E \times F) \longrightarrow F }\]

to a map

\[{ U (\subseteq E \times F) \longrightarrow E \times F }\]

which is locally invertible at ${ (a, b) , }$ that is, has nonsingular derivative at ${ (a, b), }$ we can consider

\[{ \varphi : U (\subseteq E \times F \text{ open}) \longrightarrow E \times F, }\] \[{ \varphi (x, y) = (x, \hat{f}(x, y)) . }\]

By the usual composition argument, it is a ${ C ^p }$ map. The derivative

\[{ (D \varphi) (a, b) \in L(E \times F, E \times F) }\]

is the derivative of the sum of ${ (x, y) \mapsto (x, 0) }$ and ${ (x, y) \mapsto (0, \hat{f}(x, y)) }$ at the point ${ (a, b), }$ and so is given by

\[{ \begin{aligned} &\, (D \varphi) (a, b) \, (h, k) \\ = &\, (h, D \hat{f} (a, b) \, (h, k) ) \\ = &\, (h, \partial _1 \hat{f} (a, b) \, h + \partial _2 \hat{f} (a, b) \, k ) \\ = &\, (h, \partial _1 \hat{f}(a, b) \, h + k ) \end{aligned} }\]

that is

\[{ (D\varphi) (a, b) \, \begin{pmatrix} h \\ k \end{pmatrix} = \begin{pmatrix} \text{id} _E &0 \\ \partial _1 \hat{f}(a, b) &\text{id} _F \end{pmatrix} \begin{pmatrix} h \\ k \end{pmatrix} . }\]

The continuous linear map

\[{ (D\varphi) (a, b) = \begin{pmatrix} \text{id} _E &0 \\ \partial _1 \hat{f}(a, b) &\text{id} _F \end{pmatrix} \in L(E \times F, E \times F) }\]

has a continuous linear inverse

\[{ \begin{pmatrix} \text{id} _E &0 \\ -\partial _1 \hat{f} (a, b) &\text{id} _F \end{pmatrix} \in L(E \times F, E \times F) , }\]

hence ${ D\varphi (a, b) }$ is nonsingular as needed.
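In finite dimensions the invertibility of this block-triangular map is easy to verify directly. A minimal sketch with ${ E = F = \mathbb{R} ,}$ where ${ \partial _1 \hat{f}(a, b) }$ is just a scalar ${ c }$ (an arbitrary illustrative value below):

```python
# With E = F = R, D phi(a, b) is the 2x2 matrix [[1, 0], [c, 1]]
# and its claimed inverse is [[1, 0], [-c, 1]].

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

c = 0.7                             # stand-in for d1 fhat(a, b)
M    = [[1.0, 0.0], [ c, 1.0]]      # D phi(a, b)
Minv = [[1.0, 0.0], [-c, 1.0]]      # claimed inverse

print(matmul2(M, Minv))             # identity matrix
print(matmul2(Minv, M))             # identity matrix
```

The same computation with ${ c }$ an operator is exactly the matrix identity used above: the products telescope because the diagonal blocks are identities.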

By the inverse function theorem, ${ \varphi : U (\subseteq E \times F) \longrightarrow E \times F }$ is a local ${ C ^p }$ isomorphism at ${ (a, b) . }$ There exist open neighbourhoods

\[{ (a, b) \in W \subseteq U, \quad (a, 0) \in W' \subseteq E \times F }\]

such that ${ \varphi \big\vert _{W } : W \longrightarrow W' }$ is a ${ C ^p }$ isomorphism.

Let ${ \psi = \left(\varphi \big\vert _{W } \right) ^{-1} : W' \longrightarrow W . }$ It is of the form

\[{ \psi (\mathbf{x}, \mathbf{y} ) = (\psi _1 (\mathbf{x}, \mathbf{y}), \psi _2 (\mathbf{x} , \mathbf{y})) \quad \text{ for } (\mathbf{x}, \mathbf{y}) \in W' }\]

where ${ \psi _1 : W' \to E , }$ ${ \psi _2 : W' \to F }$ are ${ C ^p }$ maps. It has a more specific structure. We have

\[{ \begin{aligned} &\, (\mathbf{x}, \mathbf{y}) \\ = &\, (\varphi \circ \psi)(\mathbf{x}, \mathbf{y}) \\ = &\, (\psi _1 (\mathbf{x}, \mathbf{y}), \, \, \hat{f}(\psi _1 (\mathbf{x}, \mathbf{y}), \psi _2 (\mathbf{x}, \mathbf{y}))) \end{aligned} }\]

that is

\[{ \mathbf{x} = \psi _1 (\mathbf{x}, \mathbf{y}), \quad \mathbf{y} = \hat{f}(\mathbf{x}, \psi _2 (\mathbf{x}, \mathbf{y})) }\]

for all ${ (\mathbf{x}, \mathbf{y}) \in W' . }$

Especially

\[{ (\mathbf{x}, 0) \in W' \implies 0 = \hat{f}(\underbrace{\mathbf{x}, \psi _2 (\mathbf{x}, 0)} _{\psi(\mathbf{x}, 0) \in W } ), }\]

that is, points ${ (\mathbf{x}, 0) \in W' }$ give points ${ \psi (\mathbf{x}, 0) = (\mathbf{x}, \psi _2 (\mathbf{x}, 0)) \in Z _{\hat{f}} \cap W . }$

Since

\[{ \lbrace x \in E : (x, 0) \in W' \rbrace }\]

is an open neighbourhood of ${ a, }$ this suggests that setting

\[{ \mathscr{U} := W, \quad \mathscr{V} := \lbrace x \in E : (x, 0) \in W' \rbrace }\]

and

\[{ g : \mathscr{V} \longrightarrow F, \quad g(x) = \psi _2 (x, 0) }\]

works.
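The construction ${ g(x) = \psi _2 (x, 0) }$ can be imitated numerically: invert ${ \varphi(x, y) = (x, \hat{f}(x, y)) }$ in its second slot with a root-finder. A minimal sketch (my own example, not from the text: ${ \hat{f}(x, y) = x^2 + y^2 - 1 }$ near ${ (0, 1) ,}$ with bisection standing in for the local inverse ${ \psi }$):

```python
import math

def fhat(x, y):
    return x**2 + y**2 - 1.0

# psi2(x, z): solve fhat(x, y) = z for y near b = 1 by bisection;
# fhat is increasing in y on [0.5, 1.5], so this mimics the second
# component of the local inverse psi of phi(x, y) = (x, fhat(x, y)).
def psi2(x, z, lo=0.5, hi=1.5, tol=1e-12):
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fhat(x, mid) < z:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def g(x):                     # the implicit function: g(x) = psi2(x, 0)
    return psi2(x, 0.0)

x = 0.2
print(g(x), math.sqrt(1.0 - x*x))   # both approx 0.9798
```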

Indeed, the graph of ${ g }$ looks like

\[{ \begin{aligned} &\, (\text{graph of } g) \\ = &\, \lbrace (x, \psi _2 (x, 0)) : (x, 0) \in W' \rbrace \\ \subseteq &\, Z _{\hat{f}} \, \cap W , \end{aligned} }\]

and it remains to show that every point of ${ Z _{\hat{f}} \cap W }$ lies in the graph of ${ g . }$

We have

\[{ \begin{aligned} &\, (x, y) \\ = &\, (\psi \circ \varphi)(x, y) \\ = &\, (x, \, \psi _2 (x, \hat{f}(x, y))) \end{aligned} }\]

for all ${ (x, y) \in W , }$ that is

\[{ y = \psi _2 (x, \hat{f}(x, y)) \quad \text{ for all } (x, y) \in W . }\]

Especially

\[{ y = \psi _2 (x, 0) \quad \text{ for all } (x, y) \in Z _{\hat{f}} \cap W . }\]

Finally

\[{ \begin{aligned} &\, (x, y) \in Z _{\hat{f}} \cap W \\ \implies &\, y = \psi _2 (x, 0), \, \, \underbrace{\varphi(x, y)} _{ (x, 0)} \in W' \end{aligned} }\]

that is

\[{ Z _{\hat{f}} \cap W \subseteq (\text{graph of } g) , }\]

as needed.

The second part, on the derivative of the implicit function ${ g , }$ follows from the chain rule. Let ${ A(x) = \hat{f} (x, g(x)) }$ be the composition

\[{ \underbrace{x} _{\in \, \mathscr{V}} \overset{\alpha}{\longmapsto} \underbrace{(x, g(x))} _{\in \, U } \overset{\hat{f}}{\longmapsto} \underbrace{\hat{f}(x, g(x))} _{\in \, F} . }\]

It is identically zero. Hence

\[{ \begin{aligned} &\, DA(x) \, h \\ = &\, (D\hat{f})(\alpha(x)) \circ (D \alpha)(x) \, h \\ = &\, (D\hat{f}) (\alpha(x)) \, (h, Dg(x) \, h) \\ = &\, \partial _1 \hat{f} (\alpha(x)) \, h + \partial _2 \hat{f} (\alpha(x)) \, Dg(x) \, h \\ = &\, \partial _1 \hat{f} (x, g(x)) \, h + \partial _2 \hat{f} (x, g(x)) \, Dg(x) \, h \\ = &\, 0 , \end{aligned} }\]

that is

\[{ \begin{aligned} &\, \partial _1 \hat{f} (x, g(x)) \, \text{id} _E + \partial _2 \hat{f} (x, g(x)) \, Dg(x) = 0 \\ &\, \text{for all } x \in \mathscr{V} . \end{aligned} }\]

The ${ C ^{p-1} }$ map

\[{ \mathscr{V} \longrightarrow L(F, F) , }\] \[{ x \longmapsto \partial _2 \hat{f} (x, g(x)) }\]

sends ${ a }$ to ${ \text{id} _F, }$ an element of the open set ${ L _{\text{is}} (F, F) . }$ So there is a neighbourhood of ${ a }$ (contained in ${ \mathscr{V} }$) over which ${ \partial _2 \hat{f} (x, g(x)) \in L _{\text{is}} (F, F) . }$

For ${ x }$ in this neighbourhood,

\[{ \begin{aligned} Dg(x) = &\, - [\partial _2\hat{f} (x, g(x))] ^{-1} \partial _1 \hat{f} (x, g(x)) \\ = &\, - [\partial _2 f \, (x, g(x))] ^{-1} \partial _1 f \, (x, g(x)) \end{aligned} }\]

as needed. ${ \blacksquare }$
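The formula also works when ${ F }$ is higher-dimensional, with ${ [\partial _2 f] ^{-1} }$ a matrix inverse. A sketch with ${ E = \mathbb{R} ,}$ ${ F = G = \mathbb{R}^2 }$ and my own illustrative choice ${ f(x, (y_1, y_2)) = (y_1 - x y_2, \; y_2 - 1 - x y_1) ,}$ whose zero set is the graph of ${ g(x) = ( x/(1-x^2), \; 1/(1-x^2) ) }$ for ${ \lvert x \rvert < 1 }$:

```python
def g(x):                     # explicit solution of f(x, g(x)) = 0
    return (x / (1 - x*x), 1 / (1 - x*x))

def Dg_formula(x):
    y1, y2 = g(x)
    # d2 f(x, y) = [[1, -x], [-x, 1]],  d1 f(x, y) = (-y2, -y1);
    # inverting the 2x2 matrix by hand (det = 1 - x^2) gives
    # -[d2 f]^{-1} d1 f = ((y2 + x*y1)/det, (x*y2 + y1)/det).
    det = 1 - x*x
    return ((y2 + x*y1) / det, (x*y2 + y1) / det)

def Dg_fd(x, h=1e-6):         # centered finite differences, componentwise
    gp, gm = g(x + h), g(x - h)
    return tuple((p - m) / (2*h) for p, m in zip(gp, gm))

x = 0.4
print(Dg_formula(x))
print(Dg_fd(x))
```

Both prints agree to finite-difference accuracy, matching the closed form ${ g'(x) = \left( (1+x^2)/(1-x^2)^2, \; 2x/(1-x^2)^2 \right) . }$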
