This document is prepared automatically using the following R command.
This lesson explains how to conduct a hypothesis test to determine whether the difference between two proportions is significant.
The test procedure, called the two-proportion z-test, is appropriate when the following conditions are met:
The sampling method for each population is simple random sampling.
The samples are independent.
Each sample includes at least 10 successes and 10 failures.
Each population is at least 20 times as big as its sample.
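The success/failure condition can be checked numerically. The following is a minimal sketch in base R, assuming the sample sizes and sample proportions used in the worked example of this lesson (n1 = 150, p1 = 0.71, n2 = 100, p2 = 0.63). The variable names are illustrative only, and the population-size condition cannot be checked here because the population sizes are not given.

```r
# Sample values taken from the worked example in this lesson
n1 <- 150; p1 <- 0.71
n2 <- 100; p2 <- 0.63

# Expected number of successes and failures in each sample
counts <- c(
  successes1 = n1 * p1,
  failures1  = n1 * (1 - p1),
  successes2 = n2 * p2,
  failures2  = n2 * (1 - p2)
)
print(counts)

# TRUE when every count is at least 10
condition_met <- all(counts >= 10)
print(condition_met)
```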
Since the above requirements are satisfied, we can use a four-step approach to conduct the hypothesis test.
The four steps are: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.
The first step is to state the null hypothesis and an alternative hypothesis.
Null hypothesis (H_0): P_1 \le P_2
Alternative hypothesis (H_1): P_1 > P_2
Note that these hypotheses constitute a one-tailed test. The null hypothesis will be rejected if the proportion from population 1 is sufficiently larger than the proportion from population 2.
For this analysis, the significance level is 0.05. The test method, shown in the next section, is a two-proportion z-test.
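For an upper-tailed test, the significance level can be translated into a critical z value: the null hypothesis is rejected when the z statistic exceeds it. A minimal sketch in base R, where `alpha` and `z_crit` are illustrative names:

```r
# Significance level stated in the analysis plan
alpha <- 0.05

# Critical value for an upper-tailed z-test: P(Z > z_crit) = alpha
z_crit <- qnorm(1 - alpha)
print(z_crit)
```

The value agrees with the `critical` column (1.644854) in the printed result of this document.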
Using sample data, we calculate the pooled sample proportion (p) and the standard error (SE). Using those measures, we compute the z-score test statistic (z).
p=\frac{p_1 \times n_1 + p_2 \times n_2}{n_1+n_2}
p=\frac{0.71 \times 150 + 0.63 \times 100}{150+100}
p=169.5/250=0.678
SE=\sqrt{p\times(1-p)\times[1/n_1+1/n_2]}
SE=\sqrt{0.678\times0.322\times[1/150+1/100]}=0.0603
z=\frac{p_1-p_2}{SE}=\frac{0.71-0.63}{0.0603}=1.33
where p_1 is the sample proportion in sample 1, p_2 is the sample proportion in sample 2, n_1 is the size of sample 1, and n_2 is the size of sample 2.
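The arithmetic above can be reproduced in a few lines of base R. This is a sketch using the sample values from this lesson; the variable names `p_pooled`, `se_pooled`, and `z` are illustrative and are not part of any package.

```r
# Sample proportions and sample sizes from the worked example
p1 <- 0.71; n1 <- 150
p2 <- 0.63; n2 <- 100

# Pooled sample proportion
p_pooled <- (p1 * n1 + p2 * n2) / (n1 + n2)

# Standard error based on the pooled proportion
se_pooled <- sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

# z-score test statistic
z <- (p1 - p2) / se_pooled

print(c(p_pooled = p_pooled, se_pooled = se_pooled, z = z))
```

The results correspond to the `ppooled`, `sepooled`, and `z` columns in the printed result of this document (0.678, 0.06032081, and 1.326242).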
Since we have a one-tailed test, the P-value is the probability that the z statistic is 1.33 or greater.
We can use the following R code to find the P-value.
p <- pnorm(1.33, lower.tail = FALSE)  # upper-tail probability, approximately 0.092
Alternatively, we can use the normal distribution curve to find the P-value.
draw_n(z=x$result$z,alternative=x$result$alternative)
Since the P-value (0.092) is greater than the significance level (0.05), we cannot reject the null hypothesis.
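The decision rule can also be written as a small function. This is a sketch in base R; `decide_upper_tail` is a hypothetical helper introduced here for illustration and is not a function from any package used in this document.

```r
# Hypothetical helper: decision for an upper-tailed z-test
decide_upper_tail <- function(z, alpha = 0.05) {
  p_value <- pnorm(z, lower.tail = FALSE)  # P(Z >= z)
  list(p_value = p_value, reject_null = p_value < alpha)
}

# z statistic and significance level from this lesson
decision <- decide_upper_tail(z = 1.33, alpha = 0.05)
print(decision)
```

With z = 1.33 the P-value is above 0.05, so `reject_null` is `FALSE`, in line with the conclusion above.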
$data
# A tibble: 1 × 2
x y
<lgl> <lgl>
1 NA NA
$result
alpha p1 p2 n1 n2 DF pd se critical ME lower
1 0.05 0.71 0.63 150 100 248 0.08 0.06085776 1.644854 0.1001021 -0.0201021
upper CI ppooled sepooled z pvalue
1 0.1801021 0.08 [95CI -0.02; 0.18] 0.678 0.06032081 1.326242 0.09237975
alternative
1 greater
$call
propCI(n1 = 150, n2 = 100, p1 = 0.71, p2 = 0.63, P = 0, alternative = "greater")
attr(,"measure")
[1] "propdiff"
The contents of this document are modified from StatTrek.com. Berman H.B., “AP Statistics Tutorial”, [online] Available at: https://stattrek.com/hypothesis-test/difference-in-proportions.aspx?tutorial=AP [Accessed: 1/23/2022].