Regression line: Predictions and Residual Plots
True or false?
If the sum of the residuals is equal to 0, then the best-fitting regression line goes through all the data points.
True
False
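A small numerical check can make the reasoning behind this question concrete. The sketch below fits a least-squares line to a made-up data set (the numbers and the helper name `least_squares` are illustrative assumptions, not part of the question) and compares the sum of the residuals with the individual residuals.

```python
# Illustrative data, assumed for this sketch only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0]

def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

intercept, slope = least_squares(xs, ys)

# Residual = observed y minus predicted y.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

sum_is_zero = abs(sum(residuals)) < 1e-9
some_point_off_line = any(abs(e) > 1e-9 for e in residuals)
print(sum_is_zero, some_point_off_line)
```

For this data the residuals cancel out in total while individual points still sit above or below the line, which suggests that a zero sum alone does not place every point on the line.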
More Solving for the Regression Line Questions:
Linear Regression
$\bar{x} = 60$, $\bar{y} = 4.8$, $s_x = 38.08$, $s_y = 2.59$, $r = 0.71$
Regression line and Residual plots
Using the data set below, determine the correlation, slope, and intercept of the least squares regression line.
$\bar{x} = 60$, $\bar{y} = 4.8$, $s_x = 38.08$, $s_y = 2.59$
Practice (Version 1)
Using the data set below, determine the correlation and intercept of the least squares regression line.
$\sum x = 300$, $\sum y = 24$, $s_x = 38.08$, $s_y = 2.59$
Correlation and Regression Line
If $r = -1.00$, then 100% of the data points fall exactly on the regression line and the slope cannot be equal to or greater than 0.
Regression Line
The following data is obtained to assess the relationship between height in cm and shoe size.
The correlation coefficient is 0.93
Which of the following is the equation of the regression line?
Regression Line
One day George wanted to examine the linear relationship between how many hours a student studies ($x$) the day before the exam and the exam grade ($y$).
He asks 13 random students and collects the following data:
$r = 0.819$, $s_x = 3.24$, $s_y = 14.24$, $\bar{x} = 8.25$, $\bar{y} = 78.25$
Regression Line
In investigating the relationship between number of people on a person's phone list ($x$) and the number of text messages per day ($y$), 40 individuals were surveyed and the following information was recorded:
$\sum_{i=1}^{40} x_i = 700 \quad s_x = 3.5$
$\sum_{i=1}^{40} y_i = 3200 \quad s_y = 18.9 \quad r = 0.67$
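The question text is not shown here, so the following is only a sketch under the assumption that the least-squares line is wanted. It uses the standard summary-statistic formulas (slope $= r\,s_y/s_x$, intercept $= \bar{y} - \text{slope}\cdot\bar{x}$); the function name `line_from_summary` is hypothetical.

```python
def line_from_summary(r, s_x, s_y, x_bar, y_bar):
    """Least-squares (intercept, slope) from summary statistics."""
    slope = r * s_y / s_x
    intercept = y_bar - slope * x_bar
    return intercept, slope

n = 40
x_bar = 700 / n    # mean phone-list size, from the sum of x
y_bar = 3200 / n   # mean text messages per day, from the sum of y

intercept, slope = line_from_summary(
    r=0.67, s_x=3.5, s_y=18.9, x_bar=x_bar, y_bar=y_bar
)
print(f"y-hat = {intercept:.3f} + {slope:.3f} x")
```

The means are recovered from the sums first, because the formulas need $\bar{x}$ and $\bar{y}$ rather than $\sum x_i$ and $\sum y_i$.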
Regression Line
The equation $\hat{y} = 3 + 0.025x$ is used to predict the amount of weight lost $y$ (in kg) by a group of individuals in a study based on the amount of their daily exercise $x$ (in minutes). Given that $1\,\text{kg} \approx 2.2\,\text{lb}$, suppose we want to change the amount of weight lost to pounds (lb) in this equation. Which of the following will be the correct linear regression line?
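One way to reason about a change of units in $y$: every predicted value is multiplied by the conversion factor, so both the intercept and the slope scale by that factor. A minimal sketch, with `rescale_response` as a hypothetical helper name:

```python
KG_TO_LB = 2.2  # conversion factor stated in the question

def rescale_response(intercept, slope, factor):
    """Rescaling y by `factor` rescales both coefficients by `factor`."""
    return intercept * factor, slope * factor

b0_lb, b1_lb = rescale_response(3, 0.025, KG_TO_LB)
print(f"y-hat (lb) = {b0_lb:.3f} + {b1_lb:.3f} x")

# Sanity check: a prediction in kg, converted to lb, matches the new line.
x = 60  # minutes of daily exercise, chosen arbitrarily for the check
kg_prediction = 3 + 0.025 * x
lb_prediction = b0_lb + b1_lb * x
```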
Slope of a Regression Line
Suppose we want to see if there is a relationship between the number of hours a student studies the day before their exam (x) and their exam grade (y). We randomly sample 8 students and record our results. The correlation coefficient is 0.819.
If a student studies an additional 2 hours the day before their final exam, by how much can they expect their mark to increase?
Linear Regression
Here's the data and summary of the weekly earnings and amount of coffee purchased by 5 different students:
$\bar{x} = 60$, $\bar{y} = 4.8$, $s_x = 38.08$, $s_y = 2.59$, $r = 0.71$
Regression Line: Changing Units of $x$
The equation $\hat{y} = 35 + 1.50x$ is used to predict the hourly wage $y$ of someone with $x$ years of experience.
Suppose that x is changed to months of experience. Which of the following will be the correct regression line?
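A change of units in $x$ works differently from a change in $y$: the intercept stays put and only the slope is rescaled, because one year of experience corresponds to 12 months. The sketch below assumes that conversion; `rescale_predictor` is a hypothetical helper name.

```python
MONTHS_PER_YEAR = 12

def rescale_predictor(intercept, slope, new_units_per_old_unit):
    """If x is re-expressed in smaller units, the slope per unit shrinks."""
    return intercept, slope / new_units_per_old_unit

b0, b1_per_month = rescale_predictor(35, 1.50, MONTHS_PER_YEAR)
print(f"y-hat = {b0} + {b1_per_month} x_months")

# Sanity check: 4 years and 48 months should give the same predicted wage.
wage_from_years = 35 + 1.50 * 4
wage_from_months = b0 + b1_per_month * 48
```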
Regression line
Suppose we want to see if there is a relationship between the size of a condo unit and the selling price of it. We randomly sampled 32 units:
$r = 0.762$, $s_x = 278.87$, $s_y = 60350.59$, $\bar{x} = 1034.84$, $\bar{y} = 445734.38$
With this linear regression plot
Regression Line
Suppose we want to see if there is a relationship between the size of a condo unit and the selling price of it. We randomly sampled 32 units:
$r = 0.762$, $s_x = 278.87$, $s_y = 60350.59$, $\bar{x} = 1034.84$, $\bar{y} = 445734.38$
With this linear regression plot
Regression Line
Suppose we want to see if there is a relationship between the size of a condo unit and the selling price of it. We randomly sampled 32 units:
$r = 0.762$, $s_x = 278.87$, $s_y = 60350.59$, $\bar{x} = 1034.84$, $\bar{y} = 445734.38$
With this linear regression plot
Regression Line: Calculation
One day George wanted to examine the linear relationship between how many hours a student studies ($x$) the day before the exam and the exam grade ($y$).
He asks 13 random students and collects the following data:
$r = 0.819$, $s_x = 3.24$, $s_y = 14.24$, $\bar{x} = 8.25$, $\bar{y} = 78.25$
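Assuming the calculation asked for is the least-squares line $\hat{y} = b_0 + b_1 x$ (the symbols $b_0$ and $b_1$ are introduced here for illustration), the standard summary-statistic formulas give approximately:

$$b_1 = r\,\frac{s_y}{s_x} = 0.819 \times \frac{14.24}{3.24} \approx 3.60$$

$$b_0 = \bar{y} - b_1\,\bar{x} \approx 78.25 - 3.60 \times 8.25 \approx 48.55$$

Under that assumption the fitted line is roughly $\hat{y} \approx 48.55 + 3.60x$.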
Regression Line: Challenging Use of Slope
Suppose we want to see if there is a relationship between the number of hours a student studies the day before their exam (x) and their exam grade (y). We randomly sample 8 students and record our results. The correlation coefficient is 0.819.
If a student studies an additional 2 hours the day before their final exam, by how much can they expect their mark to increase?
Regression Line: Changing Units of X
The equation $\hat{y} = 35 + 1.50x$ is used to predict the hourly wage $y$ of someone with $x$ years of experience. Suppose that $x$ is changed to months of experience. Which of the following will be the correct regression line?
Regression Line: Changing Units of Y Example
Problem
The equation $\widehat{\text{weight lost}} = 3 + 0.025\,(\widehat{\text{exercise}})$ is used to predict the amount of weight lost, in kilograms, by a group of individuals in a study based on the average amount of daily exercise, in minutes. Each kilogram is approximately 2.2 pounds. Suppose that we change weight lost to pounds. Which of the following describes the corresponding linear regression line?
Linear Regression
$\bar{x} = 60$, $\bar{y} = 4.8$, $s_x = 38.08$, $s_y = 2.59$, $r = 0.71$
Regression Line
Refer to Question 19.
What is the equation of the regression line?
Regression line and Residual plots
Using the data set below, determine the correlation, slope, and intercept of the least squares regression line.
$\bar{x} = 60$, $\bar{y} = 4.8$, $s_x = 38.08$, $s_y = 2.59$
Solving for the Regression Line
The equation $\hat{y} = 3 + 0.025x$ is used to predict the amount of weight lost $y$, in kilograms, by a group of individuals in a study based on the average amount of daily exercise $x$, in minutes. Each kilogram is approximately 2.2 pounds.
Suppose that we change weight lost to pounds. Determine the new linear regression line.
Regression Line
What is the intercept of the regression line?
Regression Line
Suppose Big Joe is added to the scatterplot below. He is 13 years old and has 50% body fat. What will happen to the slope and correlation coefficient?
Correlation and Regression Line
Suppose we want to see if there is a relationship between the number of times a student attended their labs in the entire semester (x) and their final exam grade (y). We randomly sample 8 students and record our results.
$\sum (x_i - \bar{x})(y_i - \bar{y}) = 264.6$
More Predictions and Residual Plots Questions:
You wish to predict the number of times someone logs in to their email a day. The estimated linear regression model is
$\text{Logins} = 5 + 1.3\,(\text{work})$
where Logins is the number of times a person logs in to their email a week and work is the number of hours of work done a week. Logan works 30 hours a week and checks his email 49 times a week. What is his residual?
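The residual calculation can be sketched in a few lines, using the usual convention that a residual is the observed value minus the predicted value. The function names `predict_logins` and `residual` are hypothetical, chosen for this illustration.

```python
def predict_logins(work_hours):
    """Predicted weekly logins from the model Logins = 5 + 1.3(work)."""
    return 5 + 1.3 * work_hours

def residual(observed, predicted):
    """Residual = observed value minus predicted value."""
    return observed - predicted

logan_predicted = predict_logins(30)             # Logan works 30 hours a week
logan_residual = residual(49, logan_predicted)   # he actually logs in 49 times

print(round(logan_predicted, 2), round(logan_residual, 2))
```

A positive residual here means Logan logs in more often than the model predicts for someone with his hours of work.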
Predictions and Residual Plots
You wish to predict the number of times someone logs in to their email a day.
The estimated linear regression model is
$\text{Logins} = 5 + 1.3\,(\text{work})$
where Logins is the number of times a person logs in to their email a week and
Predictions and Residual Plots
One day George wanted to examine the linear relationship between how many hours a student studies ($x$) the day before the exam and the exam grade ($y$).
He asks 13 random students and collects the following data:
$r = 0.819$, $s_x = 3.24$, $s_y = 14.24$, $\bar{x} = 8.25$, $\bar{y} = 78.25$
Predictions and Residual Plots
Suppose the following summary data was given:
x-variable: $\mu_x = 10$, $s_x = 2$
y-variable: $\mu_y = 250$, $s_y = 10$
Linear Regression: Residual Plots
The equation $\hat{y} = 35 + 1.50x$ is used to predict the hourly wage $y$ of someone with $x$ years of experience.
Elizabeth has 4 years of experience and earns $36 per hour. What is the residual?
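As a worked check, taking the residual $e$ (a symbol introduced here for illustration) to be the observed value minus the predicted value:

$$\hat{y} = 35 + 1.50(4) = 41, \qquad e = y - \hat{y} = 36 - 41 = -5$$

The negative sign indicates that Elizabeth earns 5 dollars per hour less than the line predicts for her experience.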
Predictions: Residual Plots
In investigating the relationship between number of people on a person's phone list ($x$) and the number of text messages per day ($y$), 40 individuals were surveyed and the following information was recorded:
$\sum_{i=1}^{40} x_i = 700 \quad s_x = 3.5$
$\sum_{i=1}^{40} y_i = 3200 \quad s_y = 18.9 \quad r = 0.67$
Predictions and Residual Plots
In linear regression, which of the following indicates that we should fit the scatter plot with a line of best fit?
Predictions and Residual Plots
Suppose we want to see if there is a relationship between the size of a condo unit and the selling price of it. We randomly sampled 16 units:
(a) Estimate the linear regression model.
(b) Unit #3 is 1250 sqft. What is its predicted selling price? Compute the residual for this estimate.
Predictions and Residual Plots
Suppose we want to see if there is a relationship between the size of a condo unit and the selling price of it. We randomly sampled 16 units, here are the results and the regression output:
(a) Estimate the linear regression model.
(b) Unit #3 is 1250 sqft. What is its predicted selling price? Compute the residual for this estimate.
Predictions and Residual Plots
In investigating the relationship between number of people on a person's phone list ($x$) and the number of text messages per day ($y$), 40 individuals were surveyed and the following information was recorded:
$\sum_{i=1}^{40} x_i = 700 \quad s_x = 3.5$
$\sum_{i=1}^{40} y_i = 3200 \quad s_y = 18.9 \quad r = 0.67$
Predictions and Residual Plots
Cindy calculated the residuals and found that all of them are negative. Which of the following is true?
Predictions and residual plots
The following assertions about the three residual plots are True or False
Residual Plot A: No problem – unbiased and homoscedastic (constant variance)
Residual Plot B: Heteroscedasticity – we see a “fan”, indication of non-constant variance across the range of values
Predictions: Residual Plots
In investigating the relationship between number of people on a person's phone list ($x$) and the number of text messages per day ($y$), 40 individuals were surveyed and the following information was recorded:
$\sum_{i=1}^{40} x_i = 700 \quad s_x = 3.5$
$\sum_{i=1}^{40} y_i = 3200 \quad s_y = 18.9 \quad r = 0.67$
Residual Plots: Calculation
In investigating the relationship between number of people on a person's phone list ($x$) and the number of text messages per day ($y$), 40 individuals were surveyed and the following information was recorded:
$\sum_{i=1}^{40} x_i = 700 \quad s_x = 3.5$
$\sum_{i=1}^{40} y_i = 3200 \quad s_y = 18.9 \quad r = 0.67$
Predictions and Residual Plots
You wish to predict the number of times someone logs in to their email a day. The estimated linear regression model is
$\text{Logins} = 5 + 1.3\,(\text{work})$
where Logins is the number of times a person logs in to their email a week and work is the number of hours of work done a week. Logan works 30 hours a week and checks his email 49 times a week. What is his residual?
Predictions and Residual Plots
The regression model $y = 4000 - 2.7x$ is used to estimate the annual number of cars going over the speed limit ($y$) based on the number of police officers on the road ($x$). Estimate the number of cars going over the speed limit when there are 40 police officers on the road.
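Substituting the given number of officers into the model gives the point estimate:

$$\hat{y} = 4000 - 2.7(40) = 4000 - 108 = 3892$$

So the model estimates about 3,892 cars per year going over the speed limit when 40 officers are on the road.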
Predictions and Residual Plots: Regression Diagnostics Question
Let’s take a look at our residual plot in our example:
Regression Diagnostics (check all that apply):
[ ] Outliers
Predictions: Residual Plots
(v) Mike actually has 8% body fat. What is the residual?
Correlation and Residual Plots
Pamela and Tim are trying to be healthy by taking the stairs instead of the elevator. We want to see if there is a relationship between the floor you live on and the time it takes to get to your floor from the lobby. Pamela lives on the 8th floor. Tim lives in the penthouse on the 20th floor.
$\sum x^2 = 258$
$\sum y^2 = 687$
Residual Plots
Here are three residual plots. Which of the following is true?
Predictions and Residual Plots
The equation $\hat{y} = 35 + 1.50x$ is used to predict the hourly wage $y$ of someone with $x$ years of experience.
(c) Elizabeth has 4 years of experience and earns $36. What is the residual?
Simple Linear Regression
Here is a residual plot for selling price (Y) vs. size of a condo in square-feet (X).
Which condition is violated?