
Hypothesis Testing for Two Proportions


Comparing Two Independent Proportions

Inference for two independent proportions compares the proportions of successes in two populations. In this chapter, the measurement is the number of successes and failures among the individuals or observations sampled from each population. We can also compare success rates between two treatments.



Wize Concept
A sample proportion $\hat p$ is equal to the number of successes $x$ divided by the sample size $n$:
$$\hat p=\frac{x}{n}$$
You can think of "successes" as the number of those who responded "YES" in your sample.
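As a small illustration of the formula, here is a Python sketch that counts the "YES" responses in a sample and divides by the sample size. The response list is made up for the example and does not come from this chapter.

```python
# Hypothetical survey responses (invented for illustration); "YES" is a success.
responses = ["YES", "NO", "YES", "YES", "NO", "YES", "NO", "YES"]

x = responses.count("YES")   # number of successes
n = len(responses)           # sample size
p_hat = x / n                # sample proportion: successes divided by sample size

print(f"x = {x}, n = {n}, p_hat = {p_hat:.3f}")
```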


Examples
We compare the proportion of employees (%) who got bonuses in the Marketing and IT departments (two populations).
We compare the survival rates (%) between natural remedies and medication (two treatments).

How to identify this type of situation
There are two samples: {Sample 1, Sample 2}
Each sample is randomly drawn from a different, non-overlapping population.
Sample 1 is drawn from Population 1 with $x_1$ (number of successes in Sample 1)
Sample 2 is drawn from Population 2 with $x_2$ (number of successes in Sample 2)
The two samples are independent (there is no matching of individuals or observations in the two samples)
Alternatively, one group could be randomly split between two treatments (randomized comparative experiment), in which case we compare the proportions of successes under each treatment.

Let $i=1,2$. Each of the two samples will consist of:
Sample proportion $\hat p_i=\frac{x_i}{n_i}$
Sample size $n_i$
The two sample sizes do not need to be equal.
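Central Limit Theorem applies:
For proportions, each observation is either a success or a failure, so the populations themselves are not normal. The normal approximation for $\hat p_1-\hat p_2$ therefore relies on both sample sizes being large.
A common rule of thumb is that each sample should contain at least 10 successes and at least 10 failures.
When in doubt, you should have sufficiently large sample sizes.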
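The sample-size check can be sketched in Python. The threshold of 10 successes and 10 failures per sample is a widely used rule of thumb that is assumed here, and `large_counts_ok` is a hypothetical helper name. The counts are invented for illustration.

```python
def large_counts_ok(x, n, threshold=10):
    """Return True if the sample has at least `threshold` successes and failures.

    The default threshold of 10 is an assumed rule of thumb.
    """
    successes = x
    failures = n - x
    return successes >= threshold and failures >= threshold

# Hypothetical samples given as (successes, sample size)
print(large_counts_ok(25, 40))   # 25 successes and 15 failures
print(large_counts_ok(4, 40))    # only 4 successes
```

Both samples in a comparison should pass the check before relying on the normal approximation.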

Example

To make inferences about the difference in the population proportions $p_1-p_2$ (parameter), we use the difference in the sample proportions $\hat p_1-\hat p_2$ (statistic).


Wize Concept
Inference includes hypothesis tests and confidence intervals.


Hypothesis Test Steps: Comparing Two Proportions
We can compare the two population proportions by running a hypothesis test. 

Wize Tip
Review Hypothesis Testing if you need a refresher of the five steps. (See: Hypothesis Testing with One Sample)

Step 1: State the hypotheses
Is this a one-sided or two-sided test?
To make inferences about the difference in the population proportions $p_1-p_2$ (two-sided test), the hypotheses are:
$$H_o:\ p_1-p_2=0\\H_a:\ p_1-p_2\neq0$$
You can also test whether one population proportion is greater than or less than the other (one-sided test), depending on the question:
$$H_o:\ p_1-p_2=0\\H_a:\ p_1-p_2>0$$
OR
$$H_o:\ p_1-p_2=0\\H_a:\ p_1-p_2<0$$

Step 2: Note the significance level $\alpha$
You may find the critical value, which depends on $\alpha$ and on whether the test is one-sided or two-sided

Step 3: Locate the relevant variables and run the appropriate test 
Find $x_1,\ x_2,\ n_1,\ n_2$
Solve for $\hat p_1$ and $\hat p_2$:
$$\boxed{\hat p_1=\frac{x_1}{n_1}}$$

$$\boxed{\hat p_2=\frac{x_2}{n_2}}$$
Test statistic for comparing two proportions:
$$\boxed{z=\frac{\hat p_1-\hat p_2}{\sqrt{\hat p\hat q\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}}$$
where $\hat p$ is the pooled sample proportion (or combined sample proportion):
$$\boxed{\hat p=\frac{x_1+x_2}{n_1+n_2}}$$

$$\hat q=1-\hat p$$
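The formulas in this step can be put together in a few lines of Python. This is a minimal sketch: the function name `two_proportion_z` is a hypothetical choice, and the counts in the usage line are invented for illustration.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for comparing two proportions."""
    p_hat1 = x1 / n1                     # sample proportion, sample 1
    p_hat2 = x2 / n2                     # sample proportion, sample 2
    p_pool = (x1 + x2) / (n1 + n2)       # pooled sample proportion
    q_pool = 1 - p_pool
    # Standard error using the pooled proportion for both samples
    se = sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))
    return (p_hat1 - p_hat2) / se

# Hypothetical counts: 45 successes out of 100 vs. 30 successes out of 100
z = two_proportion_z(45, 100, 30, 100)
print(f"z = {z:.3f}")
```

Swapping the two samples only flips the sign of $z$, so which group is called Population 1 matters for one-sided tests but not for two-sided ones.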

Wize Concept
Why do we use a pooled sample proportion when comparing two proportions? 
When conducting hypothesis tests and calculating the p-value, we assume that the null hypothesis is true. Then, for comparing two proportions, we assume that $p_1$ and $p_2$ are equal.
With this assumption, you can say that both $\hat p_1$ and $\hat p_2$ are estimating the same unknown proportion.
Thus, we pool the two samples to estimate a single proportion $p$ instead of estimating $p_1$ and $p_2$ separately.
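The pooling argument can be written out as a short derivation of the standard error in the denominator of the test statistic. This is a sketch that relies on the two samples being independent, so that the variances add; the symbol $p$ denotes the common proportion assumed under the null hypothesis.

```latex
\begin{align*}
\operatorname{Var}(\hat p_1-\hat p_2)
  &= \frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}
  && \text{(independent samples)}\\
  &= p(1-p)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)
  && \text{(under } H_o:\ p_1=p_2=p\text{)}\\
\operatorname{SE}(\hat p_1-\hat p_2)
  &\approx \sqrt{\hat p\hat q\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}
  && \text{(estimating } p \text{ with the pooled } \hat p\text{)}
\end{align*}
```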

Step 4: Find the p-value
The p-value is based on your test statistic and the number of sides
If p-value $<\alpha\ \rightarrow$ Reject $H_o$
If p-value $>\alpha\ \rightarrow$ Fail to reject $H_o$

You can also compare the critical value with the test statistic
If $\left|z\right|>\left|CV\right|\rightarrow$ Reject $H_o$
If $\left|z\right|<\left|CV\right|\rightarrow$ Fail to reject $H_o$
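The p-value rule can be sketched with Python's standard-library `statistics.NormalDist`, which provides the standard normal cumulative distribution. The function name `p_value`, the `alternative` labels, and the test statistic used below are hypothetical choices for illustration.

```python
from statistics import NormalDist

def p_value(z, alternative="two-sided"):
    """p-value for a z statistic under the standard normal distribution."""
    std_normal = NormalDist()            # mean 0, standard deviation 1
    if alternative == "greater":         # H_a: p1 - p2 > 0 (upper tail)
        return 1 - std_normal.cdf(z)
    if alternative == "less":            # H_a: p1 - p2 < 0 (lower tail)
        return std_normal.cdf(z)
    if alternative == "two-sided":       # H_a: p1 - p2 != 0 (both tails)
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")

alpha = 0.05
z = 2.19                                 # hypothetical test statistic
p = p_value(z)
decision = "Reject H_o" if p < alpha else "Fail to reject H_o"
print(f"p-value = {p:.4f} -> {decision}")
```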

Step 5: Draw your conclusion
If you reject $H_o$, you conclude that there is evidence for $H_a$.
Example: "There is evidence that the proportions differ."

If you fail to reject $H_o$, you conclude that there is no evidence for $H_a$.
Example: "There is no evidence that the proportions differ."


Example: Hypothesis Test for Differences in Population Proportions
Don King is running for mayor in Twin Pines. Based on a random sample of 60 male voters in Twin Pines, 44 of them said they view him favorably. Based on a random sample of 50 female voters in Twin Pines, 30 of them said they view him favorably. 

At the 5% significance level, is there evidence of a difference between males and females in the proportion of those who view him favorably?

(a) Solve for $\hat p_1$ and $\hat p_2$:
$$\hat p_1=\frac{x_1}{n_1}$$

$$x_1=44\\n_1=60$$
$$\hat p_1=\frac{44}{60}\approx0.7333$$

$$\hat p_2=\frac{x_2}{n_2}$$

$$x_2=30\\n_2=50$$
$$\hat p_2=\frac{30}{50}=0.60$$

(b) Solve for the pooled sample proportion $\hat p$:
$$\hat p=\frac{x_1+x_2}{n_1+n_2}$$

$$\hat p=\frac{44+30}{60+50}=\frac{74}{110}\approx0.6727$$

(c) State the hypotheses.

We are testing for the difference in proportions, so this is a two-sided test.
$$H_o:\ p_1-p_2=0\\H_a:\ p_1-p_2\neq0$$


(d) Find the critical value $z^\star$.

This is a two-sided test with a 5% significance level. Although we are looking for $z^\star$, we will use the $z$ column at the bottom of the $t$-table. (This is easier than using the z-table.)

$$z^\star=1.96$$
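If software is available, the table lookup can be cross-checked with the inverse of the standard normal cumulative distribution. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library. The one-sided value is included only for comparison and is not needed in this example.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()

# Two-sided test: alpha is split evenly between the two tails
z_star_two_sided = std_normal.inv_cdf(1 - alpha / 2)

# One-sided test: all of alpha sits in a single tail
z_star_one_sided = std_normal.inv_cdf(1 - alpha)

print(round(z_star_two_sided, 2))   # 1.96
print(round(z_star_one_sided, 3))   # 1.645
```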

(e) Solve for the test statistic for comparing two proportions:
$$\boxed{z=\frac{\hat p_1-\hat p_2}{\sqrt{\hat p\hat q\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}}$$

Using $\hat p_1=\frac{44}{60}\approx0.7333$ rather than the rounded 0.73:
$$z=\frac{0.7333-0.60}{\sqrt{\left(0.6727\right)\left(1-0.6727\right)\left(\frac{1}{60}+\frac{1}{50}\right)}}=1.484$$
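The value 1.484 comes from carrying $\hat p_1=44/60$ at full precision. The Python sketch below recomputes the statistic from the counts in the problem and also shows what happens if $\hat p_1$ is rounded to 0.73 first, which gives roughly 1.447 instead. The variable names are illustrative.

```python
from math import sqrt

# Counts from the Don King example
x1, n1 = 44, 60    # male voters: favorable, sample size
x2, n2 = 30, 50    # female voters: favorable, sample size

p_pool = (x1 + x2) / (n1 + n2)                        # pooled proportion, 74/110
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))  # pooled standard error

z_exact = (x1 / n1 - x2 / n2) / se    # proportions kept at full precision
z_rounded = (0.73 - 0.60) / se        # p_hat_1 rounded to two decimals first

print(f"z (unrounded)  = {z_exact:.3f}")
print(f"z (0.73 used)  = {z_rounded:.3f}")
```

Rounding intermediate values too early can visibly shift the test statistic, so it is safer to round only the final answer.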


(f) What is the p-value?

We will use the z-table to find the p-value. For $z=1.48$, the table gives a cumulative area of 0.9306.

Watch Out!
For two-sided tests, don't forget to double the p-value! Many students forget!

p-value $=2\cdot(1-0.9306)=0.1388$
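The table-based p-value can be cross-checked in Python with `statistics.NormalDist`. This sketch computes the two-sided p-value once with $z$ rounded to 1.48, as a z-table requires, and once with $z=1.484$. Both land close to the 0.1388 found above and well above $\alpha=0.05$.

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05

# Two-sided p-value: double the upper-tail area beyond |z|
p_table = 2 * (1 - std_normal.cdf(1.48))    # z rounded to two decimals, as in a z-table
p_exact = 2 * (1 - std_normal.cdf(1.484))   # z kept to three decimals

print(f"p-value (z = 1.48)  = {p_table:.4f}")
print(f"p-value (z = 1.484) = {p_exact:.4f}")
print("Fail to reject H_o" if p_exact > alpha else "Reject H_o")
```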


(g) Draw your conclusion. 

p-value $>\alpha\rightarrow$ Fail to reject $H_o$
You can also say that the absolute value of the test statistic $\left(z=1.484\right)$ is less than the absolute value of the critical value $\left(z^\star=1.96\right)\rightarrow$ Fail to reject $H_o$
At the 5% significance level, there is no evidence that there is a difference between males and females in terms of the proportion of those who view him favorably.

There are two restaurants at the amusement park: Hucklebee's Tavern (Population 1) and Cafe Isabelle (Population 2). Random samples are taken from each restaurant so we can compare the proportions of customers that tip. Here are the results:

At the 5% significance level, we want to determine if the proportion of customers that tip differs between the two restaurants.

(i) State the hypotheses.

$$H_o:\ p_1-p_2=0\\H_a:\ p_1-p_2>0$$

$$H_o:\ p_1-p_2=0\\H_a:\ p_1-p_2<0$$

$$H_o:\ p_1-p_2=0\\H_a:\ p_1-p_2\neq0$$
