Difference between revisions of "TDSM 2.11"

From The Data Science Design Manual Wikia

Latest revision as of 15:30, 12 December 2017

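Let [math]X_i[/math] be the annual salary of high school graduates in job position [math]i[/math], with mean [math]\bar{X}[/math],

[math]Y_i[/math] be the annual salary of college graduates in the same job position, with mean [math]\bar{Y}[/math], and

[math]n[/math] be the number of job positions.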

a) For each possible job title, the college graduates always made 5,000 dollars more than high school grads

[math]\Rightarrow \bar{Y} = \bar{X} + 5000[/math] and [math] \forall i (1 \leq i \leq n): Y_i = X_i + 5000 [/math]

Correlation coefficient of [math]X[/math] and [math]Y[/math]:

[math] \tau = \frac{\sum_{i = 1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i = 1}^{n}(X_i - \bar{X})^2} \sqrt{\sum_{i = 1}^{n}(Y_i - \bar{Y})^2}} = \frac{\sum_{i = 1}^{n}(X_i - \bar{X})(X_i + 5000 - (\bar{X} + 5000))}{\sqrt{\sum_{i = 1}^{n}(X_i - \bar{X})^2} \sqrt{\sum_{i = 1}^{n}(X_i + 5000 - (\bar{X} + 5000))^2}} = \frac{\sum_{i = 1}^{n}(X_i - \bar{X})^2} {\left( \sqrt{\sum_{i = 1}^{n}(X_i - \bar{X})^2} \right)^2} = 1 [/math]

b) For each possible job title, the college graduates always made 25% more than high school grads

[math]\Rightarrow \bar{Y} = 1.25 \bar{X}[/math] and [math] \forall i (1 \leq i \leq n): Y_i = 1.25 X_i [/math]

Correlation coefficient of [math]X[/math] and [math]Y[/math]:

[math] \tau = \frac{\sum_{i = 1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i = 1}^{n}(X_i - \bar{X})^2} \sqrt{\sum_{i = 1}^{n}(Y_i - \bar{Y})^2}} = \frac{\sum_{i = 1}^{n}(X_i - \bar{X}) \cdot 1.25 (X_i - \bar{X})}{\sqrt{\sum_{i = 1}^{n}(X_i - \bar{X})^2} \sqrt{ \sum_{i = 1}^{n}1.25^2(X_i - \bar{X})^2}} = \frac{1.25 \sum_{i = 1}^{n}(X_i - \bar{X})^2} {1.25 \left( \sqrt{\sum_{i = 1}^{n}(X_i - \bar{X})^2} \right)^2} = 1 [/math]

c) For each possible job title, the college graduates always made 15% less than high school grads


[math]\Rightarrow \bar{Y} = 0.85 \bar{X}[/math] and [math] \forall i (1 \leq i \leq n): Y_i = 0.85 X_i [/math]

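Correlation coefficient of [math]X[/math] and [math]Y[/math] (the scale factor 0.85 is positive, so [math]\sqrt{0.85^2} = 0.85[/math] and the correlation stays positive even though college graduates earn less):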

[math] \tau = \frac{\sum_{i = 1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i = 1}^{n}(X_i - \bar{X})^2} \sqrt{\sum_{i = 1}^{n}(Y_i - \bar{Y})^2}} = \frac{\sum_{i = 1}^{n}(X_i - \bar{X}) \cdot 0.85 (X_i - \bar{X})}{\sqrt{\sum_{i = 1}^{n}(X_i - \bar{X})^2} \sqrt{ \sum_{i = 1}^{n}0.85^2(X_i - \bar{X})^2}} = \frac{0.85 \sum_{i = 1}^{n}(X_i - \bar{X})^2} {0.85 \left( \sqrt{\sum_{i = 1}^{n}(X_i - \bar{X})^2} \right)^2} = 1 [/math]
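All three cases have the form [math]Y_i = a X_i + b[/math] with [math]a > 0[/math], which is why each derivation ends at 1. As a quick numerical sanity check, the sketch below evaluates the same correlation formula on a small set of salaries. The salary figures and the helper name `pearson` are made up for illustration and are not part of the original exercise.

```python
import math


def pearson(xs, ys):
    """Correlation coefficient, computed term by term as in the formula above."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: product of the square roots of the summed squared deviations.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (math.sqrt(ss_x) * math.sqrt(ss_y))


# Hypothetical high school graduate salaries, one per job position.
high_school = [28000.0, 31000.0, 35000.0, 42000.0, 50000.0]

# College graduate salaries under the three scenarios of the exercise.
cases = {
    "a) 5,000 dollars more": [x + 5000 for x in high_school],
    "b) 25% more": [1.25 * x for x in high_school],
    "c) 15% less": [0.85 * x for x in high_school],
}

results = {label: pearson(high_school, ys) for label, ys in cases.items()}
for label, r in results.items():
    print(f"{label}: correlation = {r:.6f}")
```

Up to floating-point rounding, each case should come out as 1, matching the algebra above. Any list of salaries that are not all identical would do equally well here.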