
Comparing Two Independent Proportions


Inference for two independent proportions entails comparing the proportions of successes in two populations. In this chapter, the measurement is the number of successes and failures among the individuals or observations from each population. We can also compare success rates between two treatments.



Wize Concept
A sample proportion $\hat p$ is equal to the number of successes $x$ divided by the sample size $n$:
$$\hat p=\frac{x}{n}$$
You can think of "successes" as the number of those who responded "YES" in your sample.


Examples
  • We compare the proportion of employees (%) who got bonuses in the Marketing and IT departments (two populations).
  • We compare the survival rates (%) between natural remedies and medication (two treatments).


How to identify this type of situation

  1. There are two samples: {Sample 1, Sample 2}
  2. Each sample is randomly drawn from a different, non-overlapping population.
      • Sample 1 is drawn from Population 1 with $x_1$ (number of successes in Sample 1)
      • Sample 2 is drawn from Population 2 with $x_2$ (number of successes in Sample 2)
  3. The two samples are independent (there is no matching of individuals or observations in the two samples)
      • One group could be split into two randomly assigned treatments (randomized comparative experiment), in which case we can then compare the responses (comparing two proportions).
  4. Let $i=1,2$. Each of the two samples will consist of:
      • Sample proportion $\displaystyle{\hat p_i=\frac{x_i}{n_i}}$
      • Sample size $n_i$
      • The sample sizes do not need to be equal.
  • Central Limit Theorem applies:
      • The outcomes are successes and failures, so the population distributions are not normal, and the sample sizes need to be large.
      • When in doubt, you should have sufficiently large sample sizes.
Summary


PAGE BREAK

Example

To make inferences about the difference in the population proportions $p_1-p_2$ (parameter), we use the difference in the sample proportions $\hat p_1-\hat p_2$ (statistic).


Wize Concept
Inference includes hypothesis tests and confidence intervals.


Hypothesis Test Steps: Comparing Two Proportions

We can compare the two population proportions by running a hypothesis test.

Wize Tip
Review Hypothesis Testing if you need a refresher of the five steps. (See: Hypothesis Testing with One Sample)

  • Step 1: State the hypotheses
      • Is this a one-sided or two-sided test?
      • To make inferences about the difference in the population proportions $p_1-p_2$ (two-sided test), the hypotheses are:
$$H_o:\ p_1-p_2=0\\H_a:\ p_1-p_2\neq0$$
      • You can also test whether one population proportion is greater or less than the other (one-sided test), depending on the question:
$$H_o:\ p_1-p_2=0\\H_a:\ p_1-p_2>0$$
OR
$$H_o:\ p_1-p_2=0\\H_a:\ p_1-p_2<0$$

  • Step 2: Note the significance level $\alpha$
      • You may find the critical value, depending on $\alpha$ and the number of sides
  • Step 3: Locate the relevant variables and run the appropriate test
      • Find $x_1,\ x_2,\ n_1,\ n_2$
      • Solve for $\hat p_1$ and $\hat p_2$:
$$\boxed{\hat p_1=\frac{x_1}{n_1}}$$

$$\boxed{\hat p_2=\frac{x_2}{n_2}}$$
      • Test statistic for comparing two proportions:
$$\boxed{z=\frac{\hat p_1-\hat p_2}{\sqrt{\hat p\hat q\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}}$$
      • where $\hat p$ is the pooled sample proportion (or combined sample proportion):
$$\boxed{\hat p=\frac{x_1+x_2}{n_1+n_2}}$$

      • $\hat q=1-\hat p$
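The Step 3 formulas can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the function name `two_proportion_z` and the counts passed to it are hypothetical and chosen only to show the calculation.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H_o: p1 - p2 = 0 (hypothetical helper)."""
    p1_hat = x1 / n1                      # sample proportion, sample 1
    p2_hat = x2 / n2                      # sample proportion, sample 2
    p_pool = (x1 + x2) / (n1 + n2)        # pooled sample proportion
    q_pool = 1 - p_pool
    std_error = sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / std_error

# Made-up counts, for illustration only
z = two_proportion_z(x1=45, n1=100, x2=30, n2=100)
print(round(z, 3))
```

Keeping the pooled proportion as its own variable mirrors the boxed formulas, so each line can be checked against the hand calculation.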

Wize Concept
Why do we use a pooled sample proportion when comparing two proportions?
  • When conducting hypothesis tests and calculating the p-value, we assume that the null hypothesis is true. So, when comparing two proportions, we assume that $p_1$ and $p_2$ are equal.
  • With this assumption, you can say that both $\hat p_1$ and $\hat p_2$ are estimating the same unknown proportion.
  • Thus, we pool the two samples to estimate a single proportion $p$ instead of estimating $p_1$ and $p_2$ separately.
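As a sketch of where the pooled estimate enters the test statistic, assuming the two samples are independent, the variance of the difference in sample proportions is the sum of the two variances. Setting $p_1=p_2=p$ under $H_o$ collapses it to a single unknown $p$:

```latex
\operatorname{Var}(\hat p_1-\hat p_2)
  = \frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}
  \;\overset{p_1=p_2=p}{=}\;
  p(1-p)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)
```

Replacing the unknown $p$ with the pooled sample proportion $\hat p$ (and $1-p$ with $\hat q$) and taking the square root gives the denominator of the $z$ statistic.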

  • Step 4: Find the p-value
      • The p-value is based on your test statistic and the number of sides
      • If p-value $<\alpha\ \rightarrow$ Reject $H_o$
      • If p-value $>\alpha\ \rightarrow$ Fail to reject $H_o$
      • You can also compare the critical value with the test statistic
      • If $\left|z\right|>\left|CV\right|\rightarrow$ Reject $H_o$
      • If $\left|z\right|<\left|CV\right|\rightarrow$ Fail to reject $H_o$
  • Step 5: Draw your conclusion
      • If you reject $H_o$, you conclude that there is evidence for $H_a$. Example: "There is evidence that the proportions differ."
      • If you fail to reject $H_o$, you conclude that there is no evidence for $H_a$. Example: "There is no evidence that the proportions differ."
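Steps 4 and 5 can be sketched with Python's standard-library `statistics.NormalDist`, which stands in for the z-table. The helper names `p_value` and `decide` and the `sides` labels are assumptions made for this illustration.

```python
from statistics import NormalDist

def p_value(z, sides="two"):
    """p-value for a z statistic; `sides` is 'two', 'greater', or 'less'."""
    std_normal = NormalDist()             # mean 0, standard deviation 1
    if sides == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))   # double the tail area
    if sides == "greater":
        return 1 - std_normal.cdf(z)              # H_a: p1 - p2 > 0
    if sides == "less":
        return std_normal.cdf(z)                  # H_a: p1 - p2 < 0
    raise ValueError("sides must be 'two', 'greater', or 'less'")

def decide(z, alpha, sides="two"):
    """Apply the Step 4 rule: reject H_o when the p-value is below alpha."""
    return "Reject H_o" if p_value(z, sides) < alpha else "Fail to reject H_o"

print(decide(2.5, alpha=0.05))   # made-up z value, for illustration
print(decide(1.0, alpha=0.05))
```

The two-sided branch doubles the upper-tail area of $|z|$, which is the same doubling you do by hand with the z-table.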



Example: Hypothesis Test for Differences in Population Proportions

Don King is running for mayor in Twin Pines. Based on a random sample of 60 male voters in Twin Pines, 44 of them said they view him favorably. Based on a random sample of 50 female voters in Twin Pines, 30 of them said they view him favorably.

At the 5% significance level, is there evidence of a difference between males and females in terms of the proportion of those who view him favorably?

(a) Solve for $\hat p_1$ and $\hat p_2$:
$$\hat p_1=\frac{x_1}{n_1}$$
$$x_1=44,\quad n_1=60$$
$$\hat p_1=\frac{44}{60}\approx0.7333$$

$$\hat p_2=\frac{x_2}{n_2}$$
$$x_2=30,\quad n_2=50$$
$$\hat p_2=\frac{30}{50}=0.60$$

(b) Solve for the pooled sample proportion $\hat p$:
$$\hat p=\frac{x_1+x_2}{n_1+n_2}$$

$$\hat p=\frac{44+30}{60+50}=\frac{74}{110}=0.6727$$

(c) State the hypotheses.

We are testing for the difference in proportions, so this is a two-sided test.
$$H_o:\ p_1-p_2=0\\H_a:\ p_1-p_2\neq0$$


(d) Find the critical value $z^\star$.

This is a two-sided test with a 5% significance level. Although we are looking for $z^\star$, we will use the $z$ column at the bottom of the $t$-table. (This is easier than using the z-table.)

$$z^\star=1.96$$
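If no table is at hand, the same critical value can be checked with Python's standard-library `statistics.NormalDist`. This is only a cross-check of the table lookup; the variable names are chosen for illustration.

```python
from statistics import NormalDist

alpha = 0.05
# Two-sided test: put alpha/2 in each tail, so look up the 1 - alpha/2 quantile
z_star = NormalDist().inv_cdf(1 - alpha / 2)
print(round(z_star, 2))  # 1.96
```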

(e) Solve for the test statistic for comparing two proportions:
$$\boxed{z=\frac{\hat p_1-\hat p_2}{\sqrt{\hat p\hat q\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}}$$

Using the unrounded $\hat p_1=\frac{44}{60}\approx0.7333$ (rounding to 0.73 first would give a noticeably smaller $z$):
$$z=\frac{0.7333-0.60}{\sqrt{\left(0.6727\right)\left(1-0.6727\right)\left(\frac{1}{60}+\frac{1}{50}\right)}}=1.484$$
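The arithmetic can be double-checked with a short Python sketch that uses the counts from this example (44 of 60 and 30 of 50). The variable names are chosen for illustration; working from the raw counts avoids any rounding of the intermediate proportions.

```python
from math import sqrt

x1, n1 = 44, 60   # male voters: favorable, sample size
x2, n2 = 30, 50   # female voters: favorable, sample size

p1_hat = x1 / n1
p2_hat = x2 / n2
p_pool = (x1 + x2) / (n1 + n2)            # 74 / 110
std_error = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z = (p1_hat - p2_hat) / std_error
print(round(z, 3))  # 1.484
```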


(f) What is the p-value?

We will use the z-table to find the p-value.

Watch Out!
For two-sided tests, don't forget to double the p-value! Many students forget!

p-value $=2\cdot(1-0.9306)=0.1388$
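As a cross-check on the table lookup, the two-sided p-value can also be computed with Python's standard-library `statistics.NormalDist`. The result should land close to 0.1388 but not exactly on it, because the z-table rounds $z$ to two decimals (1.48) while the code uses 1.484.

```python
from statistics import NormalDist

z = 1.484   # test statistic from part (e)
alpha = 0.05

# Two-sided test: double the upper-tail area beyond |z|
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(p_value, 4))
print(p_value > alpha)   # a p-value above alpha means we fail to reject H_o
```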


(g) Draw your conclusion.
  • p-value $>\alpha\rightarrow$ Fail to reject $H_o$
  • You can also say that the absolute value of the test statistic $\left(z=1.484\right)$ is less than the absolute value of the critical value $\left(z^{\star}=1.96\right)\rightarrow$ Fail to reject $H_o$
At the 5% significance level, there is no evidence of a difference between males and females in terms of the proportion of those who view him favorably.


There are two restaurants at the amusement park: Hucklebee's Tavern (Population 1) and Cafe Isabelle (Population 2). Random samples are taken from each restaurant so we can compare the proportions of customers that tip. Here are the results:


At the 5% significance level, we want to determine if the proportion of customers that tip differs between the two restaurants.


(i) State the hypotheses.