Predictions

When you have the linear regression equation $\hat{y}=b_0+b_1x$, you can make predictions along it.

The variable $x_i$ predicts $\hat{y}_i$. In other words, you plug in $x_i$ to solve for $\hat{y}_i$.

Example


Portions of information contained in this publication/book are printed with permission of Minitab, LLC. All such material remains the exclusive property and copyright of Minitab, LLC. All rights reserved.


Example


$$\hat{y}=58.18+2.52x$$

If Phil studied for 12 hours, what is his predicted grade?

$$\hat{y}=58.18+2.52(12)=88.42$$


Brendan studied for 20 hours. Can we predict he’ll get 109%?

No, we cannot extrapolate! The equation gives $\hat{y}=58.18+2.52(20)=108.58\approx 109$, but we can only make predictions within the data range, $x=0$ to $x=15$.
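The prediction and the extrapolation check can be sketched in a few lines of Python. This is only an illustration: the function name `predict_grade` and the constants are made up here, and the range 0 to 15 is the data range stated in the example.

```python
# Fitted line from the example: y-hat = 58.18 + 2.52x
B0, B1 = 58.18, 2.52
# Range of study hours covered by the data, as stated in the example
X_MIN, X_MAX = 0, 15

def predict_grade(hours):
    """Return the predicted grade, or None when hours is outside the data range."""
    if not (X_MIN <= hours <= X_MAX):
        return None  # predicting here would be extrapolation
    return B0 + B1 * hours

print(round(predict_grade(12), 2))  # Phil, 12 hours
print(predict_grade(20))            # Brendan, 20 hours: outside the data range
```

Returning `None` outside the range is just one possible design choice; the point is that the equation itself will happily produce a number for any $x$, so the range check has to be done separately.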



Residuals (Errors)


A residual (or "error") is the signed vertical distance between the observed (actual) value $y_i$ and the predicted value $\hat{y}_i$. There is one residual for each value of $x_i$ in the data set.

$$\boxed{e_i=y_i-\hat{y}_i}$$


Wize Concept
An "error" is not a mistake; $e_i$ simply measures by how much your prediction has overestimated ($e_i<0$) or underestimated ($e_i>0$) the observed value.
If $e_i=0$, your predicted $\hat{y}_i$ equals the actual $y_i$ value.

Example


$$\hat{y}=58.18+2.52x$$

What is the residual for $x=12$?

$$\hat{y}=58.18+2.52(12)\approx 88$$

From the table on the previous page, the observed value for $x=12$ is $y_i=85$.




Since we predicted $\hat{y}_i=88$, the residual for $x=12$ is:

$$e_i=y_i-\hat{y}_i$$
$$e_i=85-88=-3$$
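The same calculation as a tiny Python sketch (the helper name `residual` is made up for illustration):

```python
def residual(observed, predicted):
    """Residual e_i = y_i - y-hat_i (observed minus predicted)."""
    return observed - predicted

# Values from the example at x = 12: observed grade 85, predicted grade rounded to 88
e = residual(85, 88)
print(e)  # negative, so the line overestimated the actual grade
```

The order of subtraction matters: observed minus predicted, so a negative residual means the prediction was too high.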



Wize Concept
For a least-squares line with an intercept, the sum of all $e_i$ is always 0. The “best-fit” line is the one that minimizes the sum of squared residuals, which is why the method is called “least squares”.
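Both claims can be checked numerically. The sketch below uses made-up data (not the table from the example) and the standard least-squares formulas $b_1=SS_{xy}/SS_{xx}$ and $b_0=\bar{y}-b_1\bar{x}$; all function names are illustrative.

```python
# Hypothetical data for illustration only (not the table from the example)
hours = [2, 5, 8, 12, 15]
grades = [62, 70, 80, 85, 97]

def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y-hat = b0 + b1*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    return y_bar - b1 * x_bar, b1

def residuals(x, y, b0, b1):
    """List of e_i = y_i - (b0 + b1*x_i)."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

def sse(x, y, b0, b1):
    """Sum of squared residuals for the line with intercept b0 and slope b1."""
    return sum(e ** 2 for e in residuals(x, y, b0, b1))

b0, b1 = fit_line(hours, grades)

# Residuals of the least-squares line sum to 0 (up to floating-point rounding)
print(abs(sum(residuals(hours, grades, b0, b1))) < 1e-9)

# Nudging the slope or the intercept away from the fit increases the SSE
print(sse(hours, grades, b0, b1) < sse(hours, grades, b0, b1 + 0.1))
print(sse(hours, grades, b0, b1) < sse(hours, grades, b0 + 0.5, b1))
```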




The equation $\hat{y}=35+1.50x$ is used to predict the hourly wage $y$ of someone with $x$ years of experience.

Elizabeth has 4 years of experience and earns $36 per hour. What is the residual?
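One way to work this question, following the same two steps as the grade example (predict first, then subtract the prediction from the observed value):

```latex
\hat{y} = 35 + 1.50(4) = 41
\qquad
e = y - \hat{y} = 36 - 41 = -5
```

A residual of -5 would mean Elizabeth earns 5 dollars per hour less than the line predicts for someone with her experience.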




Residual Plots


A residual plot ($e_i$ on the y-axis and $x_i$ on the x-axis) helps us assess whether the regression model is appropriate. It is appropriate if the residual plot shows no pattern.



  • In the residual plot (on the right), the residuals $e_i=y_i-\hat{y}_i$ are on the y-axis, plotted against the x-variable for every data point $(x_i,y_i)$.
  • This residual plot in particular suggests that a linear model is appropriate, because the residuals fall randomly above and below the “0” line.


Wize Concept
If the data point lands exactly on the “0” line, it means $y_i=\hat{y}_i$ and $e_i=0$. "No error."
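To see what a residual plot with a pattern looks like, here is a small sketch that fits a straight line to deliberately curved, made-up data ($y=x^2$) and lists the points $(x_i, e_i)$ that the residual plot would show. The data and the helper `fit_line` are assumptions for illustration.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y-hat = b0 + b1*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    return y_bar - b1 * x_bar, b1

# Hypothetical curved data: y = x^2, so a straight line is the wrong model
x = [0, 1, 2, 3, 4]
y = [xi ** 2 for xi in x]

b0, b1 = fit_line(x, y)
res = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Each (x_i, e_i) pair is one point on the residual plot
for xi, ei in zip(x, res):
    print(xi, ei)
```

The residuals come out positive at both ends and negative in the middle, a U shape rather than a random scatter around the “0” line. That kind of pattern is the signal that a linear model is not appropriate for the data.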




Standard Deviation of the Residuals

The standard error of the residuals (also called the standard error of the estimate) is a measure of the accuracy of predictions. The smaller the standard error of the estimate is, the more accurate the predictions are.


$$S_e=\sqrt{\frac{SSE}{n-2}}=\sqrt{\frac{SS_{yy}-\frac{(SS_{xy})^2}{SS_{xx}}}{n-2}}$$

The reason $n-2$ is used (and not $n-1$) is that two parameters (i.e. the slope and the intercept) have to be estimated from the data to compute the sum of squared residuals, which uses up two degrees of freedom.





Regression Diagnostics (L.I.N.E.)

A linear regression model may not always be the most appropriate for the data you have. You should assess the appropriateness of the model by examining your residual plot and checking whether any of the conditions are violated.

Common Types of Violated Conditions

Match the residual plots with the conditions violated.

L. Linearity condition violated.

I. Independence condition violated.

N. Normality condition violated (skewed).

E. Equal spread (constant variance) condition violated.



Top row (left to right): L, N
Bottom row (left to right): E, I

Here is a residual plot for selling price (Y) vs. size of a condo in square feet (X).

Which condition is violated?


Standard Deviation of the Residuals

The standard error of the residuals (or standard error of the estimate) is a measure of the accuracy of predictions. Specifically, it measures how far the data points are from the best fit regression line overall. In regression, it is also loosely called the overall standard deviation.


The smaller the standard error of the estimate is, the more accurate the predictions are overall.

Linear regression where $k=1$ (one explanatory variable):
$$\boxed{S_e=\sqrt{\frac{SS_{yy}-\frac{(SS_{xy})^2}{SS_{xx}}}{n-2}}}$$

Wize Concept
The reason $n-2$ is used (and not $n-1$) is that two parameters (i.e. the slope and the intercept) have to be estimated from the data to compute the sum of squared residuals, which uses up two degrees of freedom.

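General formula for any value of $k$ (the number of explanatory variables), where $SSE=\sum e_i^2$ is the sum of squared residuals:

$$\boxed{S_e=\sqrt{\frac{SSE}{n-k-1}}}$$

With $k=1$, $SSE=SS_{yy}-\frac{(SS_{xy})^2}{SS_{xx}}$ and the denominator becomes $n-2$, which gives the simple linear regression formula above.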
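As a rough sketch, the two formulas translate into Python as follows. The function names and the summary values are made up for illustration; only the formulas come from the notes above.

```python
import math

def sse_simple(ss_yy, ss_xy, ss_xx):
    """SSE for simple linear regression (k = 1) from the sums of squares."""
    return ss_yy - ss_xy ** 2 / ss_xx

def standard_error(sse, n, k=1):
    """S_e = sqrt(SSE / (n - k - 1)); k is the number of explanatory variables."""
    df = n - k - 1
    if df <= 0:
        raise ValueError("need more than k + 1 observations")
    return math.sqrt(sse / df)

# Hypothetical summary values for illustration
sse = sse_simple(ss_yy=100, ss_xy=60, ss_xx=40)  # 100 - 3600/40 = 10
s_e = standard_error(sse, n=10)                  # sqrt(10 / 8)
print(round(s_e, 3))
```

Keeping the SSE calculation separate from the square-root step means `standard_error` also works for $k>1$, as long as the SSE is supplied from somewhere else (for example, from software output).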

Standard Deviation of the Residuals vs. R Squared

In addition to R Squared, the standard deviation of the residuals is used to assess goodness-of-fit in regression analysis.

For a given data set, there is an inverse relationship between $R^2$ and $S_e$:
  • As $S_e$ decreases, $R^2$ increases.
  • A low $S_e$ tells us that the distances between the data points and the fitted values are small overall, indicating better fit; hence a higher $R^2$.

  • As $S_e$ increases, $R^2$ decreases.
  • A high $S_e$ tells us that the distances between the data points and the fitted values are large overall, indicating poorer fit; hence a lower $R^2$.
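The inverse relationship follows from the fact that both quantities are built from the same SSE. The sketch below assumes the standard identity $R^2 = 1 - SSE/SS_{yy}$ (not stated in these notes) together with $S_e=\sqrt{SSE/(n-k-1)}$, and uses made-up numbers.

```python
import math

def r_squared(sse, ss_yy):
    """R^2 = 1 - SSE / SS_yy (standard identity, assumed here)."""
    return 1 - sse / ss_yy

def standard_error(sse, n, k=1):
    """S_e = sqrt(SSE / (n - k - 1))."""
    return math.sqrt(sse / (n - k - 1))

# Hypothetical: three data sets sharing the same n and SS_yy but with different SSE
n, ss_yy = 10, 100
rows = [(sse, standard_error(sse, n), r_squared(sse, ss_yy)) for sse in (5, 20, 60)]

for sse, s_e, r2 in rows:
    print(sse, round(s_e, 2), round(r2, 2))
```

Reading down the output, $S_e$ grows while $R^2$ shrinks: with $n$ and $SS_{yy}$ held fixed, a larger SSE pushes the two measures in opposite directions.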


Examples

Lower $S_e$

  • Since the overall standard deviation is relatively low, we can make more accurate predictions.
  • Residuals will tend to be small for predictions.


Higher $S_e$

  • Since the overall standard deviation is relatively high, the predictions will not be as accurate.
  • Residuals will tend to be larger for predictions.

Practice: Standard Deviation of the Residuals

Using the information below, solve for the standard deviation of the residuals $S_e$.

$n=5$
$SS_{xx}=50$
$SS_{yy}=518$
$SS_{xy}=159$



$k=1$ (simple linear regression)
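One way to work this practice question is to compute the SSE from the sums of squares first, then divide by $n-k-1$ and take the square root. A sketch of the arithmetic:

```latex
SSE = SS_{yy} - \frac{(SS_{xy})^2}{SS_{xx}} = 518 - \frac{159^2}{50} = 518 - 505.62 = 12.38
\\
S_e = \sqrt{\frac{SSE}{n-k-1}} = \sqrt{\frac{12.38}{5-1-1}} = \sqrt{4.1267} \approx 2.03
```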