Testing a single proportion

To test a single proportion with the infer package, we will use gss, a sample from the General Social Survey that ships with infer. In this example, we test whether the proportion of college graduates in the population equals 30%. The null hypothesis is p = 0.30, and the alternative is two-sided: p ≠ 0.30.

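The null distribution is built with the four main verbs of the infer pipeline: specify(), hypothesize(), generate(), and calculate(). The observed statistic needs only specify() and calculate(), because it is computed directly from the data.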

The Hypothesis Test

library(infer)
library(dplyr)

# 1. Calculate the observed statistic from the data
obs_stat <- gss |>
  specify(response = college, success = "degree") |>
  calculate(stat = "prop")

# 2. Generate the null distribution: 1000 simulated samples drawn assuming p = 0.30
null_distribution <- gss |>
  specify(response = college, success = "degree") |>
  hypothesize(null = "point", p = 0.30) |>
  generate(reps = 1000, type = "draw") |>
  calculate(stat = "prop")

# 3. Visualize the results
null_distribution |>
  visualize() +
  shade_p_value(obs_stat = obs_stat, direction = "two-sided")

# 4. Get the p-value (the exact value varies between runs because the null distribution is simulated)
null_distribution |>
  get_p_value(obs_stat = obs_stat, direction = "two-sided")
#> # A tibble: 1 × 1
#>   p_value
#>     <dbl>
#> 1   0.016
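
To see what the simulation is doing, the same test can be sketched in base R. This is a rough illustration rather than a description of infer's internals: the variable names and the seed are made up for the example, and the two-sided p-value here is taken as the share of simulated proportions at least as far from 0.30 as the observed one, which may not match infer's own calculation exactly. The exact binomial test from base R is included as a cross-check.

```r
library(infer)  # used here only for the bundled gss data

set.seed(2024)  # arbitrary seed, chosen only so the sketch is repeatable

p_null <- 0.30  # hypothesized proportion
reps   <- 1000  # number of simulated samples

# Observed proportion of respondents with a degree, dropping any missing values
has_degree <- gss$college[!is.na(gss$college)] == "degree"
n     <- length(has_degree)
p_hat <- mean(has_degree)

# Each replicate: draw n responses with P(degree) = p_null and record the proportion
null_props <- rbinom(reps, size = n, prob = p_null) / n

# Two-sided p-value: share of simulated proportions at least as far from p_null as p_hat
p_value_sim <- mean(abs(null_props - p_null) >= abs(p_hat - p_null))

# Exact binomial test from base R as a cross-check
p_value_exact <- binom.test(sum(has_degree), n, p = p_null)$p.value

c(simulated = p_value_sim, exact = p_value_exact)
```

If the two values are close to each other and to the result from get_p_value(), the simulation-based test is behaving as expected. Small differences are normal with only 1000 replicates.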