TDSM 10.1

From The Data Science Design Manual Wikia
Jump to: navigation, search

To prove that Euclidean distance is in fact a metric, we should prove these facts:

[math] 1.\hspace{0.2cm}d(x,y) \geq 0 [/math]

[math] 2.\hspace{0.2cm}d(x,y) =0\hspace{0.2cm}\text{ if and only if } x=y [/math]

[math] 3.\hspace{0.2cm}d(x,y) =d(y,x)\text{ for all } x,y [/math]

[math] 4.\hspace{0.2cm}d(x,y) \leq d(x,z) + d(z,y)\text{ for all } x,y,z [/math]

The Euclidean metric is defined as:

[math] d(x,y) = \sqrt{\sum_{i=1}^{D}{\left|x_i-y_i\right|^2}} [/math]


1. It is clear that [math] d(x,y) \geq 0 [/math]

2. [math] d(x,x) = \sqrt{\sum_{i=1}^{D}{\left|x_i-x_i\right|^2}} = 0[/math] and if [math] d(x,x) = 0 [/math] then sum of some nonnegative terms is 0, so all of them should be zero then for all [math] i,x_i=y_i [/math] then [math]x=y[/math]

3. [math] d(x,y) = \sqrt{\sum_{i=1}^{D}{\left|x_i-y_i\right|^2}} = \sqrt{\sum_{i=1}^{D}{\left|y_i-x_i\right|^2}} = d(y,x)[/math]

4. To prove this fact, we first prove Cauchy-Schwartz and then using this theorem we can prove this fact.

Cauchy-Schwartz theorem: if [math] x=(x_1,…,x_D)[/math] and [math] y=(y_1,…,y_D) [/math] then: [math] \left| \sum_{i=1}^{D}x_iy_i\right| \leq (\sum_{i=1}^{D}x_i^2)^{1/2} (\sum_{i=1}^{D}y_i^2)^{1/2}[/math]

Proof: Since [math] \left| \sum_{i=1}^{D}x_iy_i \right| \leq \sum_{i=1}^{D} \left|x_i\right|\left|y_i\right|[/math] it is sufficient to prove the inequality for [math]x_i,y_i \geq 0[/math]. Furthermore, the inequality is obvious if either [math] x=0[/math] or [math] y=0[/math] so we assume at least one [math] x_i[/math] and one [math] y_i[/math] is nonzero.

For every [math]\alpha,\beta\in {\Bbb R} [/math] we have:

[math]0 \leq \sum_{i=1}^{D}(\alpha x_i-\beta y_i)^2[/math]

Expanding the square on the right-hand side and rearranging the terms, we get that:

[math]2\alpha \beta \sum_{i=1}^{D}x_iy_i \leq \alpha^2 \sum_{i=1}^{D}x_i^2 + \beta^2 \sum_{i=1}^{D}y_i^2 [/math]

we choose:

[math] \alpha =(\sum_{i=1}^{D}y_i^2)^{1/2} [/math] , [math] \beta =(\sum_{i=1}^{D}x_i^2)^{1/2} [/math]

Then division of the resulting inequality by [math] 2 \alpha \beta[/math] proves the theorem.

we know that [math] x⋅y=x_1y_1+⋯+x_Dy_D[/math] and [math]\left|x\right|=\sqrt{x⋅x}[/math] and [math]d(x,y)=|x−y|[/math]

by using Cauchy-Schwartz we have:

[math](x+y)^2=(x+y)(x+y)=\left|x^2\right|+2(xy)+\left|y^2\right|\leq \left| x^2 \right|+ 2\left| xy \right| +\left|y^2\right| \leq \left| x^2 \right|+ 2\left| x \right| \left| y \right| +\left|y^2\right| = (\left| x \right| + \left| y \right|) ^2[/math]

which results in:

[math] x+y \leq \left| x \right| + \left| y \right| [/math]

now consider 3 points:

[math] d(x,y)=\left|x-y\right| = \left|x-z+z-y\right| \leq \left|x-z\right| + \left|z-y\right| = d(x,z)+d(z,y)[/math]