Blog (mostly math)

Kernel Density Estimators

Ref:

  • “Data Analysis for Social Scientists” by Duflo, Ellison. Lecture 5.
  • Scikit-learn’s Density estimation documentation.
  • “Introduction to Mathematical Statistics” by Hogg, McKean, Craig.
  • “Introduction to Nonparametric Estimation” by Tsybakov.
  • “Smoothing methods in Statistics” by Simonoff.

Consider data ${ x _1, \ldots, x _n }$ sampled from a random variable ${ X }$ with density ${ f(x) . }$

Q) Can we approximate ${ f(x) }$ from the sample?

Fix a bandwidth ${ h > 0 . }$ Consider the interval ${ (x - h, x + h) . }$

Note that, when ${ h }$ is small and ${ f }$ is continuous at ${ x , }$

\[{ \mathbb{P}(x - h < X < x + h) \approx f(x) 2h . }\]
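As a quick numerical sanity check of this approximation, the sketch below takes ${ X }$ to be standard normal and compares the exact interval probability with ${ 2h f(x) . }$ The choice of distribution and the helper names are assumptions made only for illustration; the argument itself does not depend on them.

```python
import math

def normal_pdf(x):
    """Standard normal density, used here as an illustrative choice of f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def interval_probability(x, h):
    """Exact P(x - h < X < x + h) for a standard normal X."""
    return normal_cdf(x + h) - normal_cdf(x - h)

x = 0.5
for h in (0.5, 0.1, 0.01):
    exact = interval_probability(x, h)
    approx = 2.0 * h * normal_pdf(x)
    print(f"h={h}: exact={exact:.6f}, approx={approx:.6f}, gap={abs(exact - approx):.2e}")
```

The gap shrinks quickly as ${ h }$ decreases, which is why the bandwidth is thought of as small.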

On the other hand, this probability is approximately the fraction of sample points lying in the interval:

\[{ {\begin{aligned} &\, \mathbb{P}(x - h < X < x + h) \\ = &\, \mathbb{P} \left(-1 < \frac{X - x}{h} < 1 \right) \\ \approx &\, \frac{1}{n} \sum _{i = 1} ^{n} I \left( \frac{x _i - x}{h} \right) \end{aligned}} }\]

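where ${ I(u) = \mathbb{1}(u \in [-1, 1]) }$ is the indicator function of ${ [-1, 1] . }$ Including the endpoints makes no difference, since ${ X }$ has a density.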

Equating the two approximations and dividing by ${ 2h }$ gives the estimator

\[{ \hat{f} _1 (x) = \frac{1}{2nh} \sum _{i = 1} ^{n} I \left( \frac{x _i - x}{h} \right) }\]

where, as before, ${ I(u) = \mathbb{1}(u \in [-1, 1]) . }$

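Note that the estimator ${ \hat{f} _1 (x) }$ is not smooth: it is piecewise constant in ${ x , }$ and jumps whenever ${ x - h }$ or ${ x + h }$ crosses a sample point.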
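A minimal sketch of ${ \hat{f} _1 }$ in plain Python is given below. The function name and the toy sample are hypothetical, chosen only to make the counting explicit.

```python
def box_kernel_estimate(x, sample, h):
    """f_hat_1(x): count the sample points within distance h of x,
    then scale the count by 1 / (2 n h)."""
    n = len(sample)
    count = sum(1 for x_i in sample if abs(x_i - x) <= h)
    return count / (2.0 * n * h)

sample = [0.0, 1.0, 2.0]  # toy data, for illustration only
h = 1.0
for x in (-0.5, 0.5, 1.5, 2.5, 3.5):
    print(x, box_kernel_estimate(x, sample, h))
```

Every value returned is an integer multiple of ${ \frac{1}{2nh} , }$ which is another way of seeing that the estimate is a step function of ${ x . }$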

Q) Can we approximate ${ f(x) }$ smoothly from the sample?

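Write the approximation above as

\[{ \mathbb{P} \left(-1 < \frac{X - x}{h} < 1 \right) \approx \frac{2}{n} \sum _{i = 1} ^{n} \frac{1}{2} I \left( \frac{x _i - x}{h} \right) }\]

where ${ \frac{1}{2} I }$ is the density of the uniform distribution on ${ [-1, 1] . }$ Replacing this density by a smooth one gives, heuristically, the approximation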

\[{ \mathbb{P} \left(-1 < \frac{X - x}{h} < 1 \right) \approx \frac{2}{n} \sum _{i = 1} ^{n} G \left( \frac{x _i - x}{h} \right) }\]

where ${ G }$ is the standard Gaussian density.

Dividing by ${ 2h }$ as before, we have a smooth estimator

\[{ \boxed{ \hat{f} _2 (x) = \frac{1}{nh} \sum _{i = 1} ^{n} G \left( \frac{x _i - x}{h} \right) } }\]

where ${ G(x) }$ is the standard Gaussian density.
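The boxed formula translates almost line by line into code. The sketch below uses only the standard library; the function names and the toy sample are hypothetical.

```python
import math

def gaussian_density(u):
    """Standard Gaussian density G(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def gaussian_kde(x, sample, h):
    """f_hat_2(x) = (1 / (n h)) * sum_i G((x_i - x) / h)."""
    n = len(sample)
    return sum(gaussian_density((x_i - x) / h) for x_i in sample) / (n * h)

sample = [0.0, 1.0, 2.0]  # toy data, for illustration only
h = 0.5
for x in (-0.5, 0.5, 1.0, 1.5, 2.5):
    print(x, gaussian_kde(x, sample, h))
```

Unlike the indicator version, every sample point contributes to the estimate at every ${ x , }$ with a weight that decays smoothly with the distance ${ \vert x _i - x \vert . }$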

In general consider

\[{ \hat{f}(x) = \frac{1}{nh} \sum _{i = 1} ^{n} K \left( \frac{x _i - x}{h} \right) }\]

where ${ K }$ is a nonnegative function which integrates to ${ 1 , }$ so that ${ \hat{f} }$ is itself a probability density.

We call this the kernel density estimator of ${ f(x) }$ with bandwidth ${ h > 0 }$ and kernel ${ K . }$
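To illustrate the general definition, the sketch below takes the kernel as a function argument and checks numerically that the resulting estimate integrates to about ${ 1 . }$ The uniform and Gaussian kernels are the two used above; the Epanechnikov kernel ${ \frac{3}{4}(1 - u^2) }$ on ${ [-1, 1] }$ is an extra example not discussed in this post. All function names, the toy sample, and the integration range are assumptions made for illustration.

```python
import math

def kernel_density_estimate(x, sample, h, kernel):
    """f_hat(x) = (1 / (n h)) * sum_i K((x_i - x) / h)."""
    n = len(sample)
    return sum(kernel((x_i - x) / h) for x_i in sample) / (n * h)

def uniform_kernel(u):
    """(1/2) * indicator of [-1, 1]; gives back f_hat_1."""
    return 0.5 if abs(u) <= 1.0 else 0.0

def gaussian_kernel(u):
    """Standard Gaussian density; gives back f_hat_2."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def epanechnikov_kernel(u):
    """(3/4) * (1 - u^2) on [-1, 1]; nonnegative and integrates to 1."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def riemann_integral(g, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    width = (b - a) / steps
    return width * sum(g(a + (k + 0.5) * width) for k in range(steps))

sample = [0.0, 1.0, 2.0]  # toy data, for illustration only
h = 0.5
kernels = [
    ("uniform", uniform_kernel),
    ("gaussian", gaussian_kernel),
    ("epanechnikov", epanechnikov_kernel),
]
for name, kernel in kernels:
    total = riemann_integral(
        lambda x: kernel_density_estimate(x, sample, h, kernel), -10.0, 12.0
    )
    print(f"{name}: integral of the estimate is about {total:.4f}")
```

Passing the kernel as a parameter keeps the estimator itself to a single line, and any other nonnegative function integrating to ${ 1 }$ can be swapped in the same way.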
