
Plotting & Interpreting Data

Numerical Data

Numerical data is information about a variable that is recorded as numbers. There are two main types of numerical data:
  1. Continuous data -- you measure these
  2. Discrete data -- you count these


When we don't know whether two variables are related, we can perform an experiment to collect numerical data about them and see if there's a relationship.

Experiment Example 1
An experiment can be in the form of a survey asking homeowners the number of rooms in their house and their house prices.
  • One variable is the number of rooms, which is discrete numerical data
  • Another variable is the house price, which is discrete/continuous numerical data

Example 2
An experiment can be in the form of dropping an egg from different floors of a building and recording the time it takes the egg to reach the ground.
  • One variable is the height from which the egg is dropped, which is continuous numerical data
  • Another variable is the time it takes the egg to reach the ground, which is continuous numerical data

Example 3
An experiment can be in the form of gathering information on the age of a patient when they are first diagnosed with lung cancer and the patient's life expectancy.
  • One variable is the age of the patient, which is continuous/discrete numerical data
  • Another variable is the patient's life expectancy, which is continuous/discrete numerical data

How to organize numerical data

It is common to organize the numerical data gathered from an experiment using a table of values.

Then we can plot this data on a scatter plot:
  1. Identify the independent (horizontal axis) and dependent (vertical axis) variables in the experiment
  2. Plot the points from the table of values on the graph
  3. Observe whether there's a relationship between the variables

Wize Tip
Sometimes we can connect the data points to see if there's a trend/relationship between the variables.
  • If the data is continuous, we connect the points using a solid line (all values in between points are possible)
  • If the data is discrete, we connect the points using a dotted line (it's not possible for the variable to take on values in between points)
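
The plotting steps above can also be sketched in code. The following Python sketch is only an illustration: the table values, the column names `rooms` and `price_thousands`, and the helper names `to_points`, `ascii_scatter`, and `line_style` are all hypothetical and not part of the lesson. It pairs each independent value with its dependent value, marks the points on a rough text grid, and picks the line style described in the tip.

```python
# Hypothetical table of values (made up for illustration only).
table = [
    {"rooms": 2, "price_thousands": 150},
    {"rooms": 3, "price_thousands": 210},
    {"rooms": 4, "price_thousands": 260},
    {"rooms": 5, "price_thousands": 340},
]


def to_points(rows, independent, dependent):
    """Step 1: put the independent variable first (horizontal axis)."""
    return [(row[independent], row[dependent]) for row in rows]


def ascii_scatter(points, width=20, height=8):
    """Step 2: mark each point with '*' on a rough text grid."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        # Scale each value to a column/row index; guard against a zero range.
        col = round((x - x_min) / (x_max - x_min) * (width - 1)) if x_max > x_min else 0
        row = round((y - y_min) / (y_max - y_min) * (height - 1)) if y_max > y_min else 0
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return ["".join(line) for line in grid]


def line_style(is_continuous):
    """Solid line for continuous data, dotted line for discrete data."""
    return "solid" if is_continuous else "dotted"


points = to_points(table, "rooms", "price_thousands")
plot_lines = ascii_scatter(points)
print("\n".join(plot_lines))  # Step 3: look for a pattern in the marks
print("Connect the points with a", line_style(is_continuous=False), "line")
```

The number of rooms is counted, so the sketch asks for a dotted line. For a measured variable such as time, `line_style(is_continuous=True)` would give a solid line instead.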


Example: Plotting & Interpreting Data

The following table shows the numerical data for 10 houses.


a) Identify the independent and dependent variable.

Since the house price will typically depend on the number of rooms, the independent variable is the # of rooms, and the dependent variable is the house price.

b) Do the variables represent continuous or discrete data?
  • We can count the # of rooms -- it is discrete
  • House prices are represented in large dollar amounts -- it is continuous (technically money can be discrete, since we can count money and it increases in fixed steps, but because the dollar values are really large in this example, we can consider house prices as continuous data)

c) Construct a scatter plot for the data.

The horizontal axis represents the number of rooms. The vertical axis represents the house price in thousands of dollars.

d) Does the scatter plot suggest that the # of rooms in a house is related to the house price?

Since the points seem to be going up and to the right, it suggests that the number of rooms and house prices are related -- the more rooms there are, the more expensive the house price seems to be.
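
A rough numerical check of the "up and to the right" pattern is sketched below. This is not the visual method used in the lesson: it swaps in the sign of the covariance as a stand-in for eyeballing the plot. The data values and the function name `trend_direction` are hypothetical.

```python
def trend_direction(points):
    """Report whether the points tend to rise or fall from left to right."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Positive when larger x values tend to come with larger y values.
    covariance_sum = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if covariance_sum > 0:
        return "upward"
    if covariance_sum < 0:
        return "downward"
    return "no clear direction"


# Hypothetical (rooms, price in thousands of dollars) pairs, made up for illustration.
houses = [(2, 150), (3, 210), (4, 260), (5, 340)]
print(trend_direction(houses))
```

An "upward" result matches points that go up and to the right. It only suggests a relationship, just as the scatter plot does.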

Practice: Interpreting Data

The following graph represents the temperature (in °F) in different months of the year (from January to June).

Which is the independent variable and the dependent variable?

Practice: Plotting Data


There's a species of fish called the Giant Trevally that can jump out of water to catch and eat birds that are flying just above the water. The following table shows the height of one of these fish.


a) Is time the independent or dependent variable in this situation?

b) Is height the independent or dependent variable in this situation?

c) Create a scatter plot for this data.

Practice: Plotting & Interpreting Data

The number of rabbits in a certain forest is given by the following table of values.


a) Identify the independent and dependent variable.

b) Create a scatter plot for this data.

c) Does there appear to be a relationship between the variables?

d) Describe this relationship if there is one.

Curve of Best Fit

Sometimes the data points in a scatter plot can follow a pattern/trend that is not a straight line. In these situations, it might be more appropriate to draw a smooth curve that follows the pattern of the data points -- this is called the curve of best fit.

Note
In this case, the two variables do not follow a linear relation!
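
To see why a curve can describe such data better than a straight line, the sketch below compares the two on made-up numbers. Everything in it is an assumption for illustration: the data set, the helper names `fit_line` and `total_squared_error`, and the choice of an exponential shape for the curve. The lesson's curve of best fit is sketched by hand and can take any smooth shape.

```python
import math


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def total_squared_error(predict, xs, ys):
    """Sum of squared vertical distances between predictions and data."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical data that doubles at every step (made up for illustration).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 8, 16, 32]

# Straight line fitted directly to the data.
slope, intercept = fit_line(xs, ys)


def line(x):
    return slope * x + intercept


# Assumed curve shape: exponential, fitted as a line through (x, ln y).
growth, log_start = fit_line(xs, [math.log(y) for y in ys])


def curve(x):
    return math.exp(log_start + growth * x)


line_error = total_squared_error(line, xs, ys)
curve_error = total_squared_error(curve, xs, ys)
print(f"line error:  {line_error:.2f}")
print(f"curve error: {curve_error:.2f}")
```

A smaller error means the shape stays closer to the data points, which is the same idea as judging by eye which sketch follows the points more closely.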


Wize Tip
When sketching the curve of best fit, the rule for discrete vs. continuous data still applies!
  • For continuous data: sketch the curve of best fit using a solid line because all values in between data points are achievable
  • For discrete data: sketch the curve of best fit using a dotted line because not all values in between data points are achievable


Example: Curve or Line of Best Fit


a) Construct a scatter plot representing the data from the table above.
b) Sketch a line of best fit for these data points.
c) Sketch a curve of best fit for these data points.
d) Would a line of best fit or curve of best fit be better for approximating the y value when x = 60? Explain.
e) Would a line of best fit or curve of best fit be better for approximating the y value when x = 15? Explain.
f) Is y = 5 a good estimate for when the value of x = 25?
g) Is y = 100 a good estimate for when the value of x = 60?

Parts a), b), and c)



Parts d) and e)

In either case, the curve of best fit gives us a better approximation because it follows the data points more closely.

Part f)

According to the curve of best fit, when x = 25, the y value should be somewhere close to the middle between 1 and 9. So, y = 5 is a good estimate.

Part g)

According to the curve of best fit, when x = 60, the y value should be much larger than 81, which is the value when x = 50. Although y = 100 is larger than 81, it's not much larger. So, y = 100 is NOT a good estimate.

*Consider the "y-gap" between x = 40 and x = 50: the difference in y values is 81 - 27 = 54. Since the curve of best fit gets steeper and steeper as x increases, the next gap must be larger than 54, so the y value when x = 60 will be larger than 81 + 54 = 135.
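
The "y-gap" argument can be written out as a short calculation. The sketch below uses only the two values quoted in the solution (y = 27 at x = 40 and y = 81 at x = 50). The function name `lower_bound_for_next` is hypothetical, and the reasoning assumes the x values are equally spaced and the curve keeps getting steeper.

```python
def lower_bound_for_next(y_previous, y_last):
    """If the curve keeps getting steeper, the next gap is at least the last gap."""
    last_gap = y_last - y_previous
    return y_last + last_gap


y_at_40 = 27
y_at_50 = 81

bound = lower_bound_for_next(y_at_40, y_at_50)
estimate = 100

print("y-gap between x = 40 and x = 50:", y_at_50 - y_at_40)
print("y at x = 60 should be larger than:", bound)
print("Is y = 100 a good estimate?", estimate > bound)
```

Because 100 falls below the bound, the calculation agrees with the conclusion in part g).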

Practice: Curve of Best Fit for the Height of a Ball

Kathy is trying to shoot a basketball through a hoop, and Leigh is recording the height of the basketball at various points in time.


a) What is the independent variable?

b) What is the dependent variable?

c) Draw a scatter plot to represent these data points.

d) Decide if a line or curve of best fit is better for approximating the height of this basketball. Then draw in the line or curve of best fit.

e) True or False? The height of the ball at time t = 1.5 seconds is approximately 12.1 ft.

f) True or False? A reasonable estimate for the time when the ball's height reaches h = 10 ft is 0.7 s or 3.3 s.

Bonus
g) If you extend the curve of best fit to the left past the vertical axis, the curve passes through the point (-0.23, 0). Interpret the real-world meaning of this point.

Practice: Curve of Best Fit for the Growth of Bacteria


Josh is growing 2 different types of bacteria -- bacteria A and bacteria B. He recorded the number (in thousands) of each type of bacteria in the table below.



a) Create a scatter plot for both types of bacteria, then sketch the curve or line of best fit for both types of bacteria.

b) According to the curves of best fit, which type of bacteria will reach 1 million first?

c) Using the curves of best fit, estimate the point in time when the number of bacteria A is approximately the same as the number of bacteria B. How many bacteria of each type are there at this point?