
Solving for Correlation


$r$ = correlation, where $-1\le r\le 1$


$$\boxed{r=\frac{1}{n-1}\sum\Big(\frac{x_i-\bar{x}}{S_x}\Big)\Big(\frac{y_i-\bar{y}}{S_y}\Big)=\frac{\sum XY-\frac{(\sum X)(\sum Y)}{n}}{(n-1)S_xS_y}=\frac{{\color{green}SS_{xy}}}{\sqrt{{\color{blue}SS_{xx}}\,{\color{red}SS_{yy}}}}}$$


where
$$S_x=\sqrt{\frac{\sum(x_i-\bar{x})^2}{n-1}}=\sqrt{\frac{SS_{xx}}{n-1}}$$

$$S_y=\sqrt{\frac{\sum(y_i-\bar{y})^2}{n-1}}=\sqrt{\frac{SS_{yy}}{n-1}}$$

$${\color{blue}SS_{xx}}=\sum(x_i-\bar{x})^2=\sum x_i^2-\frac{(\sum x_i)^2}{n}$$

$${\color{red}SS_{yy}}=\sum(y_i-\bar{y})^2=\sum y_i^2-\frac{(\sum y_i)^2}{n}$$

$${\color{green}SS_{xy}}=\sum(x_i-\bar{x})(y_i-\bar{y})=\sum x_iy_i-\frac{(\sum x_i)(\sum y_i)}{n}$$
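As a rough check on these definitions, the sketch below computes $r$ from raw data in two ways: as the average product of standardized scores, and as $SS_{xy}/\sqrt{SS_{xx}\,SS_{yy}}$, which keeps the sign of $SS_{xy}$. The data set and the function names are made up for illustration and do not come from the lesson.

```python
import math

def sums_of_squares(xs, ys):
    """Return (SS_xx, SS_yy, SS_xy) using the shortcut (computational) formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return ss_xx, ss_yy, ss_xy

def correlation_from_z_scores(xs, ys):
    """r = 1/(n-1) * sum of products of standardized x and y values."""
    n = len(xs)
    ss_xx, ss_yy, _ = sums_of_squares(xs, ys)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_x = math.sqrt(ss_xx / (n - 1))  # sample standard deviation of x
    s_y = math.sqrt(ss_yy / (n - 1))  # sample standard deviation of y
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)

def correlation_from_ss(xs, ys):
    """r = SS_xy / sqrt(SS_xx * SS_yy); the sign comes from SS_xy."""
    ss_xx, ss_yy, ss_xy = sums_of_squares(xs, ys)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Hypothetical data, chosen only to illustrate the formulas.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_z = correlation_from_z_scores(xs, ys)
r_ss = correlation_from_ss(xs, ys)
print(round(r_z, 4), round(r_ss, 4))  # both forms give the same value
```

For this made-up data, $SS_{xx}=10$, $SS_{yy}=6$ and $SS_{xy}=6$, so both functions return $6/\sqrt{60}\approx 0.7746$.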



Wize Concept
We can also rewrite the following formula, using $\bar{x}=\frac{\sum X}{n}$ and $\bar{y}=\frac{\sum Y}{n}$:
$$r=\frac{\sum XY-\frac{(\sum X)(\sum Y)}{n}}{(n-1)S_xS_y}=\frac{\sum XY-n\,\frac{(\sum X)}{n}\,\frac{(\sum Y)}{n}}{(n-1)S_xS_y}=\frac{\sum XY-n\,\bar{x}\,\bar{y}}{(n-1)S_xS_y}$$
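A quick numerical check of this rewrite, as a sketch: the two numerators, $\sum XY-\frac{(\sum X)(\sum Y)}{n}$ and $\sum XY-n\,\bar{x}\,\bar{y}$, should agree for any data set. The values below are hypothetical and only illustrate the identity.

```python
import math

# Hypothetical data, not taken from the lesson.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(xs)
sum_xy = sum(x * y for x, y in zip(xs, ys))
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Numerator written with the raw sums.
numerator_sums = sum_xy - sum(xs) * sum(ys) / n
# Numerator written with the sample means.
numerator_means = sum_xy - n * x_bar * y_bar

print(numerator_sums, numerator_means)
print(math.isclose(numerator_sums, numerator_means))
```

Which form is more convenient depends on whether a question gives the sums or the means.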


Exam Tip
There are several ways to calculate $r$, depending on what information you are given:

$$\sum X,\ \sum Y,\ S_x,\ S_y,\ \bar{x},\ \bar{y},\ n,\ SS_{xx},\ SS_{yy},\ SS_{xy},\ \sum(x_i-\bar{x})^2,\ \sum(y_i-\bar{y})^2$$






Example: Solving for Correlation


Find the correlation coefficient and interpret it.



$n=5$
$\sum x=30$
$\sum y=25$
$\sum xy=122$
$\sum x^2=220$
$\sum y^2=159$


$$r=\frac{1}{n-1}\sum\Big(\frac{x_i-\bar{x}}{S_x}\Big)\Big(\frac{y_i-\bar{y}}{S_y}\Big)=\frac{\sum XY-\frac{(\sum X)(\sum Y)}{n}}{(n-1)S_xS_y}=\frac{{\color{green}SS_{xy}}}{\sqrt{{\color{blue}SS_{xx}}\,{\color{red}SS_{yy}}}}$$



$${\color{blue}SS_{xx}}=\sum(x_i-\bar{x})^2=\sum x_i^2-\frac{(\sum x_i)^2}{n}$$

$$SS_{xx}=220-\frac{(30)^2}{5}=40$$

$${\color{red}SS_{yy}}=\sum(y_i-\bar{y})^2=\sum y_i^2-\frac{(\sum y_i)^2}{n}$$

$$SS_{yy}=159-\frac{(25)^2}{5}=34$$

$${\color{green}SS_{xy}}=\sum(x_i-\bar{x})(y_i-\bar{y})=\sum x_iy_i-\frac{(\sum x_i)(\sum y_i)}{n}$$

$$SS_{xy}=122-\frac{(30)(25)}{5}=-28$$


$$r=\frac{{\color{green}SS_{xy}}}{\sqrt{{\color{blue}SS_{xx}}\,{\color{red}SS_{yy}}}}$$

The sign of $r$ comes from $SS_{xy}$, which is negative here:

$$r=\frac{-28}{\sqrt{(40)(34)}}=-\sqrt{\frac{(28)^2}{(40)(34)}}=-\sqrt{0.5765}=-0.7593$$

Exam Tip
There are multiple ways to solve for $r$. Choose the formula that matches the quantities you are given.



Interpretation: $r=-0.7593$ indicates a fairly strong, negative linear correlation. As $x$ increases, $y$ tends to decrease.
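The arithmetic in this example can be reproduced with a short script. This is only a sketch: the variable names are illustrative, and $n=5$ is taken from the denominators used in the worked solution.

```python
import math

# Summary statistics from the example; n = 5 is the sample size
# used in the worked solution's denominators.
n = 5
sum_x, sum_y = 30, 25
sum_xy, sum_x2, sum_y2 = 122, 220, 159

# Shortcut formulas for the sums of squares and cross-products.
ss_xx = sum_x2 - sum_x ** 2 / n      # 220 - 900/5
ss_yy = sum_y2 - sum_y ** 2 / n      # 159 - 625/5
ss_xy = sum_xy - sum_x * sum_y / n   # 122 - 750/5

# Dividing SS_xy by the square root keeps the sign of SS_xy.
r = ss_xy / math.sqrt(ss_xx * ss_yy)

print(ss_xx, ss_yy, ss_xy)  # 40.0 34.0 -28.0
print(round(r, 4))          # -0.7593
```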




We want to predict a person’s body fat based on their age. Age and percentage body fat were measured in 18 adult males.

$$\begin{array}{l}\sum x=702\\ \sum y=461\\ \sum xy=19{,}871\\ \sum x^2=31{,}396\\ \sum y^2=13{,}449\end{array}$$





(i) What is the average age in the sample?

Example: Solving for Correlation

Solve for the correlation coefficient.


$n=6$
$\sum x=42$
$\sum y=235$
$\sum xy=1911$
$S_x=3.3466$
$S_y=18.7341$


$$r=\frac{1}{n-1}\sum\Big(\frac{x_i-\bar{x}}{S_x}\Big)\Big(\frac{y_i-\bar{y}}{S_y}\Big)=\frac{\sum XY-\frac{(\sum X)(\sum Y)}{n}}{(n-1)S_xS_y}=\frac{{\color{green}SS_{xy}}}{\sqrt{{\color{blue}SS_{xx}}\,{\color{red}SS_{yy}}}}$$


There are multiple ways to solve for $r$. Based on the variables given in this question, we use this formula:


$$r=\frac{\sum XY-\frac{(\sum X)(\sum Y)}{n}}{(n-1)S_xS_y}$$

$$r=\frac{1911-\frac{(42)(235)}{6}}{(6-1)(3.3466)(18.7341)}=\frac{266}{313.4815}=0.8485$$


Strong, positive correlation.
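The same calculation as a short script, offered as a sketch with illustrative variable names; $n=6$ is taken from the worked solution. With $S_x$ and $S_y$ rounded to four decimals, the denominator comes out to about 313.48, which agrees with the worked value to two decimal places and gives the same $r$ after rounding.

```python
# Summary statistics from the example; n = 6 is the sample size
# used in the worked solution.
n = 6
sum_x, sum_y, sum_xy = 42, 235, 1911
s_x, s_y = 3.3466, 18.7341  # sample standard deviations (rounded)

# Numerator: sum of cross-products about the means.
numerator = sum_xy - sum_x * sum_y / n   # 1911 - 9870/6
# Denominator: (n - 1) times the product of the standard deviations.
denominator = (n - 1) * s_x * s_y

r = numerator / denominator

print(numerator)      # 266.0
print(round(r, 4))    # 0.8485
```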






$\sum X=33{,}115$
$\sum Y=14{,}263{,}500$
$\sum XY=15{,}157{,}905{,}000$

Solve for the correlation coefficient.


Extra Practice