
Confidence Interval for Differences in Population Means

To make inferences about the difference in the population means $\mu_1-\mu_2$ (the parameter), we use the difference in the sample means $\overline{x}_1-\overline{x}_2$.

Wize Tip
Review Confidence Intervals if you need a refresher. (See: Estimating with Confidence Intervals)


We do not know the actual value of the difference in population means $\mu_1-\mu_2$, but we can estimate it with a reasonable range by constructing a confidence interval at a chosen confidence level $C$ (e.g. 90%, 95%, 99%). Like all confidence intervals, it is constructed as a point estimate plus or minus a margin of error (MOE).
$$[\text{point estimate}]\pm[\text{MOE}]$$
  • When comparing two means, the point estimate (or statistic) is $\overline{x}_1-\overline{x}_2$.

Example
We wish to estimate the difference in house prices between houses with fireplaces (Population 1) and houses without fireplaces (Population 2).
  • The parameter that we are trying to estimate is $\mu_1-\mu_2$.
  • $\mu_1$ is unknown.
  • $\mu_2$ is unknown.
  • Therefore, the difference $\mu_1-\mu_2$ is unknown.

Suppose we draw samples from each population. Results:
  • The average price of homes (in thousands) with fireplaces in Sample 1 is $\overline{x}_1=650$
  • The average price of homes (in thousands) without fireplaces in Sample 2 is $\overline{x}_2=500$
  • The point estimate or statistic is $\overline{x}_1-\overline{x}_2=650-500=150$
  • We estimate that the true difference in prices is $150,000.
The above is just a point estimate. Suppose we construct a 95% confidence interval. Results:
$[50, 200]$
  • 50 is the lower confidence limit (LCL)
  • 200 is the upper confidence limit (UCL)
  • We are 95% confident that the true difference in prices is between $50,000 and $200,000.
  • We do not know the value of the true difference in prices but we are very confident that it is somewhere within that confidence interval.


Confidence Interval for Two Means Containing Zero

Comparing two means using a confidence interval for the difference $\mu_1-\mu_2$ is similar to a hypothesis test of no difference:

$H_0:\mu_1-\mu_2=0$ (no difference)
$H_a:\mu_1-\mu_2\ne0$ (difference)

Wize Concept
If the confidence interval does not contain "0", then there is evidence that the two population means differ.
If the interval contains "0", then there is no evidence that the two population means differ.

Example #1
A 95% confidence interval to estimate $\mu_1-\mu_2$ is $[36, 70]$.
  • The confidence interval does not contain zero.
  • The LCL and UCL have the same sign (both positive).
  • There is evidence of a difference.
  • Specifically, $\mu_1$ is greater than $\mu_2$.



Example #2
A 95% confidence interval to estimate $\mu_1-\mu_2$ is $[-70, -36]$.
  • The confidence interval does not contain zero.
  • The LCL and UCL have the same sign (both negative).
  • There is evidence of a difference.
  • Specifically, $\mu_1$ is less than $\mu_2$.



Example #3
A 95% confidence interval to estimate $\mu_1-\mu_2$ is $[-15, 45]$.
  • The confidence interval contains zero.
  • The LCL is negative and the UCL is positive.
  • There is no evidence of a difference.
  • It's possible that $\mu_1$ is greater than $\mu_2$.
  • It's also possible that $\mu_1$ is less than $\mu_2$.
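The rule used in the three examples can be sketched as a small Python function. This is only an illustration: the function name `interpret_interval` and the wording of the returned messages are assumptions made here, not part of the notes.

```python
def interpret_interval(lcl, ucl):
    """Classify a confidence interval for mu_1 - mu_2 by whether it contains 0."""
    if lcl > ucl:
        raise ValueError("the lower limit must not exceed the upper limit")
    if lcl > 0:
        # Both limits are positive, so the whole interval lies above 0.
        return "evidence of a difference: mu_1 > mu_2"
    if ucl < 0:
        # Both limits are negative, so the whole interval lies below 0.
        return "evidence of a difference: mu_1 < mu_2"
    # Otherwise 0 lies between the limits (or on one of them).
    return "no evidence of a difference"


# The three intervals from Examples #1 to #3.
for lcl, ucl in [(36, 70), (-70, -36), (-15, 45)]:
    print(f"[{lcl}, {ucl}] -> {interpret_interval(lcl, ucl)}")
```

Checking the signs of the two limits is the same as checking whether the interval contains zero, which is why the examples point out whether the LCL and UCL share a sign.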



Confidence Interval for Two Means (Independent Populations)


$$\begin{array}{|c|c|}\hline
\text{\bf Unequal Variances} & \text{\bf Pooled/Equal Variances}\\\hline
\\
(\overline{x}_1-\overline{x}_2)\pm t^*\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}} & (\overline{x}_1-\overline{x}_2)\pm t^*s_p\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}\\
\\\hline
\\
df=\text{the minimum of: } (n_1-1,n_2-1) & df=n_1+n_2-2\\
\\\hline
\end{array}$$
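The pooled formula uses the pooled standard deviation $s_p$, which is not defined in the table. As a reminder (this is the usual textbook definition, stated here as an assumption about what the notes intend), it weights each sample variance by its degrees of freedom:

$$s_p=\sqrt{\dfrac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}}$$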
We make inferences with confidence intervals.

Wize Concept
If the confidence interval does not contain "0", then there is evidence that the two population means differ.
If the interval contains "0", then there is no evidence that the two population means differ.


Example

Ben, the owner of Fitness Planet, wonders whether the amount of time spent at the gym differs by gender. His results:

$$\begin{array}{|c|c|c|c|}\hline
 & \text{Sample Size} & \text{Mean Time at Gym (min)} & \text{Standard Deviation}\\\hline
\text{Male} & 85 & 63 & 19\\\hline
\text{Female} & 68 & 56 & 8\\\hline
\end{array}$$
Ben assumes that the two populations have unequal variances. He generates this output:
$$\begin{array}{c}
\colorbox{yellow}{Comparing Means [ t-test assuming unequal variances (heteroscedastic) ]}\\\hline
\begin{array}{cccc}
\textit{Descriptive Statistics} & & &\\\hline
\textit{VAR} & \textit{Sample size} & \textit{Mean} & \textit{Variance}\\\hline
\textit{Male} & 85 & 63. & 361. \\
\textit{Female} & 68 & 56. & 64.\\\hline\\
\textit{Summary} &&&\\\hline
\textit{Degrees Of Freedom} & 118 & \textit{Hypothesized Mean Difference} & 0.E+0\\
\textit{Test Statistics} & 3.07318 & \textit{Pooled Variance} & 229.21854\\\\
\textit{Two-tailed distribution} &&&\\\hline
\textit{p-level} & 0.00263 & \textit{t Critical Value (5\%)} & 1.98027\\\hline\\
\textit{One-tailed distribution} &&&\\\hline
\textit{p-level} & 0.00132 & \textit{t Critical Value (5\%)} & 1.65787\\\hline
\end{array}
\end{array}$$

What is the $p$-value? What would you conclude from a hypothesis test?

$H_0:\mu_1-\mu_2=0$
$H_a:\mu_1-\mu_2\ne0$ (two-tail test)

The $p$-value is 0.00263. We have very strong evidence that the amount of time spent at the gym differs by gender.
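Several numbers in the software output can be reproduced from the summary statistics alone. The sketch below uses only the Python standard library. The function names are made up for illustration, and the degrees-of-freedom formula is the Welch-Satterthwaite approximation: that the software uses this formula is an assumption, although it is consistent with the 118 shown in the output.

```python
import math

# Summary statistics from the table above: sample size, mean, standard deviation.
n1, mean1, sd1 = 85, 63.0, 19.0   # male
n2, mean2, sd2 = 68, 56.0, 8.0    # female


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Test statistic for H0: mu_1 - mu_2 = 0 with unequal variances."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


def welch_df(n1, sd1, n2, sd2):
    """Welch-Satterthwaite approximate degrees of freedom (assumed formula)."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def pooled_variance(n1, sd1, n2, sd2):
    """Weighted average of the two sample variances."""
    return ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)


print(round(welch_t(n1, mean1, sd1, n2, mean2, sd2), 5))   # test statistic
print(welch_df(n1, sd1, n2, sd2))                          # a little above 118
print(round(pooled_variance(n1, sd1, n2, sd2), 5))         # pooled variance
```

The $p$-value itself needs the $t$ distribution, which the standard library does not provide, so it is taken from the output rather than recomputed here.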
Construct a 95% confidence interval. What does it tell you (in plain English)?

We use:
$$(\overline{x}_1-\overline{x}_2)\pm t^*\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}$$
$$\boxed{\begin{array}{l}
\begin{array}{ll}
df & =\text{the minimum of: } (n_1-1,n_2-1)\\
 & =\text{the minimum of: }(84,67)\\
 & =67
\end{array}\\
\text{Thus, $t^*$ = 1.996 (using software)}
\end{array}}$$

$$(63-56)\pm1.996\sqrt{\dfrac{(19)^2}{85}+\dfrac{(8)^2}{68}}$$

$$7\pm4.55$$

$$[2.45,\ 11.55]$$

We are 95% confident that the difference in mean time spent at the gym (men minus women) is between 2.45 minutes and 11.55 minutes.
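The arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the function name `unequal_variance_ci` is hypothetical, and the critical value $t^*=1.996$ is taken from the worked solution rather than computed.

```python
import math


def unequal_variance_ci(mean1, sd1, n1, mean2, sd2, n2, t_star):
    """Confidence interval for mu_1 - mu_2 using the unequal-variances formula."""
    point_estimate = mean1 - mean2
    margin_of_error = t_star * math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return point_estimate - margin_of_error, point_estimate + margin_of_error


# Degrees of freedom as in the worked solution: the smaller of n1 - 1 and n2 - 1.
df = min(85 - 1, 68 - 1)

# t* for 95% confidence with df = 67, as given in the worked solution.
t_star = 1.996

lower, upper = unequal_variance_ci(63, 19, 85, 56, 8, 68, t_star)
print(df, round(lower, 2), round(upper, 2))   # 67 2.45 11.55
```

Passing `t_star` in as an argument keeps the sketch free of third-party libraries. With statistical software available, the critical value could be looked up from the $t$ distribution instead.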

Does the interval contain "0"? Interpret what that means.

The interval does not contain "0", and both limits are positive. This is evidence that one group (the men) spends more time at the gym on average than the other group (the women).

Thus, we can conclude that the amount of time spent at the gym differs by gender.
→ This is the same conclusion as the hypothesis test.