
Scatterplots

A scatterplot is a mathematical diagram that illustrates the relationship, if any, between two quantitative variables, X and Y, for a set of data, where each plotted point is a coordinate pair (X, Y).
- The quantitative variable that we are predicting is called the response variable, referred to as Y.
- Y is also known as the dependent variable.
- The quantitative variable that we are basing our predictions on is called the explanatory variable or predictor variable, referred to as X.
- X is also known as the independent variable.
Examples
- The number of transactions made at a store (X) explains the revenue generated (Y).
- The depth of a bathtub (X) determines how much water it can hold (Y).
- The number of hours spent at swim practice (X) predicts the time it takes to finish a lap (Y).
A scatterplot can show:
- A positive (+) association,
- A negative (−) association, or
- No association
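One rough numerical check of the direction, sketched here as an assumption rather than something the lesson defines, is the sign of the sum of products of deviations from the means (the same sign as the sample covariance). The function name `association_direction` and the swim data are made up for illustration, and the check only captures straight-line trends.

```python
from statistics import mean


def association_direction(xs, ys):
    """Classify the direction of a linear association (hypothetical helper).

    Returns "positive", "negative", or "none" based on the sign of the
    sum of products of deviations from the means.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    # Points above both means or below both means contribute positively;
    # points above one mean and below the other contribute negatively.
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "none"


# Made-up data: more hours of swim practice, shorter lap times.
practice_hours = [1, 2, 3, 4, 5]
lap_seconds = [50, 48, 45, 44, 40]
direction = association_direction(practice_hours, lap_seconds)
```

With real data the total is rarely exactly zero, so a value close to zero is better read as a weak association or none at all, and the scatterplot itself remains the main evidence.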
Example
Suppose we want to see if there is a relationship between how many hours a student studies the day before the exam and their exam grade. We randomly sampled 8 students:

- Hours studied is the explanatory variable (X)
- Grade is the response variable (Y)
Given the (X, Y) data above, we can generate a scatterplot:
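The sampled values are not reproduced here, so the sketch below uses invented placeholder numbers for 8 students, not the lesson's data. It pairs each student's hours with their grade and, assuming matplotlib is installed, draws the points with the explanatory variable on the X-axis. The names `paired_points` and `plot_scatter` are hypothetical.

```python
def paired_points(xs, ys):
    """Pair each X value with its Y value, one (X, Y) point per student."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must have the same number of values")
    return list(zip(xs, ys))


def plot_scatter(xs, ys, x_label, y_label, path="scatter.png"):
    """Save a scatterplot to `path` (assumes matplotlib is available)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)        # explanatory variable on X, response on Y
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    fig.savefig(path)
    plt.close(fig)
    return path


# Placeholder data (invented for illustration only).
hours_studied = [0.5, 1, 2, 3, 4, 5, 6, 7]
exam_grade = [55, 60, 62, 70, 75, 80, 88, 90]
points = paired_points(hours_studied, exam_grade)
```

Calling `plot_scatter(hours_studied, exam_grade, "Hours studied (X)", "Grade (Y)")` would then write the figure to `scatter.png`.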

Important observations
- The response variable (Y) is on the Y-axis; the explanatory (X) is on the X-axis.
- This is always the case unless it is unclear which variable explains the other (e.g. # incoming calls vs. # outgoing calls), in which case either variable can go on either axis.
- The variables must be quantitative; we cannot make scatterplots based on categorical variables.
- Each person is represented by a single data point in the scatterplot with an (X, Y) coordinate.
- The mean-mean point, whose coordinates are the mean of X and the mean of Y, is always in the center of the cloud of data points.
- In this example,
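As a small illustration of the mean-mean point, the sketch below averages X and Y separately and pairs the two means. The numbers are made up and are not the lesson's sample; `mean_mean_point` is a hypothetical helper name.

```python
from statistics import mean


def mean_mean_point(xs, ys):
    """Return the point (mean of X, mean of Y) for paired data."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty lists of equal length")
    return (mean(xs), mean(ys))


# Made-up data for four students (illustration only).
hours = [1, 2, 3, 4]
grades = [60, 70, 80, 90]
center = mean_mean_point(hours, grades)  # (2.5, 75)
```

Plotting `center` alongside the data points would show it sitting in the middle of the cloud.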
Practice: Scatterplots
This question has two parts.
(i) Which of the following may be shown as a scatterplot?