2024-10-14
Week SEVEN
Difference between \(\hat{p}\) and \(p\), where \(\hat{p}\) is what we observe and \(p\) is what we believe to be true in general
We form a hypothesis about \(p\) and observe \(\hat{p}\) as evidence
We know \(\hat{p}\) so we don’t need to form a hypothesis about it. It’s the sample proportion we observe. It’s a tool to make an inference about \(p\), the actual proportion. The hypothesis is an assertion about something we don’t know for certain. We’ll use the evidence (\(\hat{p}\)) plus what we know about distributions to verify or invalidate the assertion.
Note: The book uses \(p\) in two ways: as a \(p\)-value in a hypothesis test, and as \(p\), a population proportion. Related to the population proportion is the sample proportion, \(\hat{p}\), pronounced p-hat. Too many \(p\)s!
The sampling distribution for \(\hat{p}\) based on a sample of size \(n\) from a population with a true proportion \(p\) is nearly normal when:

1. the sample's observations are independent, e.g., they come from a simple random sample, and
2. we expect to see at least 10 successes and 10 failures in the sample, i.e., \(np \geq 10\) and \(n(1-p) \geq 10\). This is called the success-failure condition.
When these conditions are met, then the sampling distribution of \(\hat{p}\) is nearly normal with mean \(p\) and standard error \(\text{SE}=\sqrt{p(1-p)/n}\).
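As a quick sketch (the numbers here are made up for illustration, not taken from the textbook), the condition check and the standard error can be computed in R:

```r
# Hypothetical values, chosen only for illustration
p <- 0.3      # assumed true proportion
n <- 100      # sample size

# Success-failure condition: expect at least 10 successes and 10 failures
c(successes = n * p, failures = n * (1 - p))

# Standard error of the sample proportion
sqrt(p * (1 - p) / n)
```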
A confidence interval provides a range of plausible values for the parameter \(p\), and when \(\hat{p}\) can be modeled using a normal distribution, the confidence interval for \(p\) takes the form
\[ \hat{p} \pm z^{*} \times \text{SE} \]
where \(z^{*}\) is the critical value corresponding to the selected confidence level, e.g., 1.96 for a 95 percent confidence interval.
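A minimal sketch of the whole interval, assuming a made-up sample with 58 successes out of 100 observations (since \(p\) is unknown, \(\hat{p}\) is plugged into the standard error):

```r
# Hypothetical sample: 58 successes out of n = 100 observations
n <- 100
p_hat <- 58 / n

# Critical value for a 95 percent confidence level
z_star <- qnorm(0.975)        # about 1.96

# Standard error, substituting p-hat for the unknown p
se <- sqrt(p_hat * (1 - p_hat) / n)

# Confidence interval: point estimate plus or minus z* times SE
p_hat + c(-1, 1) * z_star * se
```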
The textbook, Diez, Çetinkaya-Rundel, and Barr (2019), recommends a four-step cycle for both confidence intervals and hypothesis tests. It differs a bit from the seven-step hypothesis testing method given in Week 06, but achieves the same result.
The same four-step cycle applies to hypothesis testing for a proportion.
This is probably the most important part of this section for practical purposes: choosing a sample size so that the margin of error is acceptably small. The following expression denotes the margin of error:
\[ z^{*}\sqrt{\frac{p(1-p)}{n}} \]
You have to choose the margin of error you want to report. The book gives an example of 0.04. So you want to find
\[ z^{*}\sqrt{\frac{p(1-p)}{n}} < 0.04 \]
The problem is that you don’t know \(p\). Since the worst-case scenario is \(p=0.5\), you have to use that unless you have some information about \(p\). Recall that \(z^{*}\) represents the \(z\)-score for the desired confidence level, so you have to choose that as well. The book gives an example where you want a 95 percent confidence level, so you choose 1.96. You can find this in R with `qnorm(0.025, lower.tail = FALSE)`, which returns the \(z\)-score for the upper tail. The reason for using 0.025 instead of 0.05 is that the 0.05 probability is split between the two tails. The complementary function is `pnorm(1.959964, lower.tail = FALSE)`, which returns 0.025.
Once you have decided on values for \(p\) and \(z^*\), solve the preceding inequality for \(n\).
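A short sketch of that calculation in R, using the worst-case \(p = 0.5\), the 0.04 margin of error, and the 95 percent confidence level mentioned above:

```r
z_star <- qnorm(0.025, lower.tail = FALSE)   # 1.959964
p <- 0.5                                     # worst-case value for the unknown p
m <- 0.04                                    # desired margin of error

# Solving z* * sqrt(p(1-p)/n) < m for n
n <- (z_star / m)^2 * p * (1 - p)
ceiling(n)                                   # round up: at least 601 people
```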
In this section, we’re just modifying the previous section to account for a difference instead of a single proportion.
The difference \(\hat{p}_1-\hat{p}_2\) can be modeled using the normal distribution when

1. the data are independent within each sample and between the two samples, and
2. the success-failure condition holds for both samples.

When these conditions are met, the confidence interval for the difference takes the form
\[ \begin{align} \text{point estimate } &\pm z^* \times \text{SE} \\ \rightarrow (\hat{p}_1 - \hat{p}_2) &\pm z^* \times \sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}} \end{align} \]
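A sketch with two made-up samples (60 successes out of 100 in group 1, 45 out of 100 in group 2), again substituting the sample proportions into the standard error:

```r
# Hypothetical samples, for illustration only
n1 <- 100; p_hat1 <- 60 / n1
n2 <- 100; p_hat2 <- 45 / n2

z_star <- qnorm(0.975)   # 95 percent confidence level

# Standard error of the difference in sample proportions
se_diff <- sqrt(p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2)

# Confidence interval for p1 - p2
(p_hat1 - p_hat2) + c(-1, 1) * z_star * se_diff
```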
When the null hypothesis is that the proportions are equal, use the pooled proportion (\(\hat{p}_\text{pooled}\)) to verify the success-failure condition and estimate the standard error
\[ \hat{p}_\text{pooled} = \frac{\text{number of “successes”}}{\text{number of cases}} = \frac{\hat{p}_1n_1+\hat{p}_2n_2}{n_1+n_2} \]
Here \(\hat{p}_1n_1\) represents the number of successes in sample 1 since
\[ \hat{p}_1 = \frac{\text{number of successes in sample 1}}{n_1} \]
Similarly, \(\hat{p}_2n_2\) represents the number of successes in sample 2.
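Continuing the same made-up samples, a sketch of the pooled proportion and the test of \(H_0: p_1 = p_2\):

```r
# Hypothetical samples: 60/100 successes in group 1, 45/100 in group 2
n1 <- 100; p_hat1 <- 0.60
n2 <- 100; p_hat2 <- 0.45

# Pooled proportion: total successes over total cases
p_pool <- (p_hat1 * n1 + p_hat2 * n2) / (n1 + n2)

# Success-failure check using the pooled proportion
c(p_pool * n1, (1 - p_pool) * n1, p_pool * n2, (1 - p_pool) * n2)

# Standard error under the null hypothesis that p1 = p2
se_pool <- sqrt(p_pool * (1 - p_pool) / n1 + p_pool * (1 - p_pool) / n2)

# z statistic and two-sided p-value
z <- (p_hat1 - p_hat2) / se_pool
2 * pnorm(abs(z), lower.tail = FALSE)
```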
The \(\chi^2\) test, pronounced k\(\overline{\text{i}}\) square, is useful in many circumstances. The textbook treats two such circumstances:

1. checking whether the group counts in a single sample plausibly follow a hypothesized distribution (a goodness-of-fit test), and
2. checking whether two categorical variables in a two-way table are independent of each other.
For the first circumstance, we could divide a sample of people into groups by race or gender and examine all the groups at once for resemblance to the general population, rather than comparing them in pairs. The \(\chi^2\) statistic permits such an all-at-once comparison.
The \(\chi^2\) statistic is given by the following formula for \(g\) groups.
\[ \chi^2 = \frac{(\text{observed count}_1-\text{null count}_1)^2}{\text{null count}_1} + \cdots + \frac{(\text{observed count}_g-\text{null count}_g)^2}{\text{null count}_g} \]
where null count refers to the number of observations expected in each group under the null hypothesis.
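In R the whole sum collapses to one line; here is a sketch with made-up observed and null counts for three groups:

```r
# Hypothetical counts for g = 3 groups (illustration only)
observed <- c(30, 50, 20)
null_count <- c(25, 55, 20)   # expected counts under the null hypothesis

sum((observed - null_count)^2 / null_count)
```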
You have to be careful about how you determine the null count. For instance, the textbook gives an example of races of jurors. In such a case, the null counts should come from the population who can be selected as jurors. This might be a matter of some dispute since jurors are usually recruited through voting records and these records may not reflect the correct proportions. Statisticians tend to like things they can count, and some people are harder (more expensive) to count than others, particularly people in marginalized populations.
The \(\chi^2\) distribution is sometimes used to characterize data sets and statistics that are always positive and typically right skewed. Recall that a normal distribution has two parameters, mean and standard deviation, that can be used to describe its exact characteristics. The \(\chi^2\) distribution has just one parameter, called degrees of freedom (\(df\)), which influences the shape, center, and spread of the distribution.
Here is a picture of the \(\chi^2\) distribution for several values of \(df\) (1–8).
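A sketch along these lines (not the original figure code) will draw that kind of picture:

```r
# Overlay chi-square density curves for df = 1 through 8
curve(dchisq(x, df = 1), from = 0.1, to = 20, ylim = c(0, 0.5),
      xlab = expression(chi^2), ylab = "density")
for (k in 2:8) curve(dchisq(x, df = k), add = TRUE)
```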
In the jurors example, we can calculate the appropriate \(p\)-value in R from the \(\chi^2\) statistic calculated from the sample, 5.89, and the parameter \(df = k-1\), the number of groups minus one:
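One way to do that is with `pchisq`, taking the upper tail:

```r
# Upper-tail area beyond 5.89 for a chi-square distribution with 3 degrees of freedom
pchisq(5.89, df = 3, lower.tail = FALSE)   # roughly 0.117
```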
This is a relatively large \(p\)-value given our earlier choices of cutoffs of 0.1, 0.05, and 0.01.
Consequently, we fail to reject the null hypothesis, which is that no racial bias is in evidence in juror selection. Bear in mind that this finding relies on our belief about the proportions in the population, which may have been systematically miscounted in the case of marginalized populations.
The \(\chi^2\) test can be conducted in R for the juror example given in the book as follows.
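A sketch of that call, with `o` holding the observed juror counts and `e` the expected counts from the book's table (the values shown are assumed from that example), reproduces the output that follows:

```r
# Observed juror counts and expected counts (assumed from the textbook's table)
o <- c(205, 26, 25, 19)         # White, Black, Hispanic, Other
e <- c(198, 19.25, 33, 24.75)   # expected counts under the registered-voter proportions

# chisq.test wants probabilities summing to 1, so rescale e by the total count
chisq.test(o, p = e / sum(o))
```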
```
	Chi-squared test for given probabilities

data:  o
X-squared = 5.8896, df = 3, p-value = 0.1171
```
Note that I had to make an adjustment in R. The argument `p` of `chisq.test` is supposed to be a vector of probabilities summing to 1. The way the table in the book presented it, the expected counts were not probabilities summing to one, so I divided each element of the expected-count vector `e` by the sum of the observed-count vector `o`.
Suppose you have a two-way table. DataCamp gives an example with gender and sport as the two ways. The following table lists the number of males and females who like each of three sports: archery, boxing, and cycling.
```r
female <- c(35, 15, 50)
male <- c(10, 30, 60)
df <- cbind(male, female)
rownames(df) <- c("archery", "boxing", "cycling")
df
```
```
        male female
archery   10     35
boxing    30     15
cycling   60     50
```
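Passing the table straight to `chisq.test` gives the test of independence reported below:

```r
# Chi-square test of independence between gender and sport preference
chisq.test(df)
```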
```
	Pearson's Chi-squared test

data:  df
X-squared = 19.798, df = 2, p-value = 5.023e-05
```
The \(\chi^2\) test suggests that sport preference and gender are not independent, meaning that preferences may differ by gender.
END
This slideshow was produced using Quarto
Fonts are Roboto Condensed Bold, JetBrains Mono Nerd Font, and STIX2