Discrete Distributions
The Hypergeometric distribution is generated from a process similar to that of the Binomial distribution. The Binomial distribution has five requirements, and these can be met in two ways: first, by randomly sampling from an infinite population; second, by sampling sequentially and replacing each sampled item after recording the measurement.
For the Hypergeometric distribution, neither of these occurs. One samples from a finite population without replacing the unit. Most polling data are actually Hypergeometric data, not Binomial, because polling firms sample from what is actually a finite (albeit large) population without allowing a unit to be re-sampled.
With that said, the difference in point estimates is nonexistent. Regardless of whether the random variable is treated as a Binomial random variable or a Hypergeometric random variable, the expected value is identical: nK/N, which is np with p = K/N. Additionally, the difference in confidence intervals is slight for population sizes on the order of most state populations. As such, assuming Hypergeometric data are Binomial data changes the estimated precision very little.
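To make the comparison concrete, here is a minimal sketch that computes the mean and standard deviation of the sample count under both models, using the standard moment formulas for each distribution. The poll figures (N, K, n) and the function names are hypothetical, chosen only to resemble a survey of a mid-sized state.

```python
import math

def binomial_moments(n, p):
    """Mean and standard deviation of a Binomial(n, p) count."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean, sd

def hypergeometric_moments(n, N, K):
    """Mean and standard deviation of a Hypergeometric(n, N, K) count."""
    p = K / N
    mean = n * p
    # Finite population correction: shrinks the variance because
    # units are not replaced after being sampled.
    fpc = (N - n) / (N - 1)
    sd = math.sqrt(n * p * (1 - p) * fpc)
    return mean, sd

# Hypothetical poll: 1,000 respondents drawn from 5,000,000 people,
# 2,600,000 of whom support a measure.
N, K, n = 5_000_000, 2_600_000, 1_000
b_mean, b_sd = binomial_moments(n, K / N)
h_mean, h_sd = hypergeometric_moments(n, N, K)

print("means:", b_mean, h_mean)
print("standard deviations:", b_sd, h_sd)
print("ratio of standard deviations:", h_sd / b_sd)
```

The two means agree exactly, and the standard deviations differ only through the finite population correction, which is very close to 1 whenever n is a tiny fraction of N.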
The Hypergeometric distribution has three parameters: the population size, N; the sample size, n; and the number of successes in the population, K. Note that these are restricted relative to one another. For instance, K ≤ N always. Also, n ≤ N. Finally, if we define x as the number of successes in the sample, x can be no smaller than the larger of 0 and n + K − N, and no larger than the smaller of n and K.
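The bounds on x can be sketched in code. The example below takes the upper bound as the smaller of n and K, and also evaluates the standard Hypergeometric probability mass function, which is not stated on this page and is included here as an assumption about what the reader will want next. The function names and the small population used in the demonstration are hypothetical.

```python
from math import comb

def support(n, N, K):
    """Smallest and largest possible number of successes in the sample."""
    low = max(0, n + K - N)   # the sample may be forced to include successes
    high = min(n, K)          # cannot exceed the sample size or the successes available
    return low, high

def hypergeometric_pmf(x, n, N, K):
    """P(X = x) when n units are drawn without replacement from N units,
    K of which are successes."""
    low, high = support(n, N, K)
    if x < low or x > high:
        return 0.0
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

# Hypothetical population of 10 units with 7 successes, sample of 5:
# only 3 failures exist, so at least 2 of the 5 sampled units are successes.
print(support(5, 10, 7))  # (2, 5)
```

Summing `hypergeometric_pmf` over the values from `low` to `high` should give 1, which is a quick check that the bounds are right.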
Here is a graphic of a sample Hypergeometric distribution:
In this example,
X ~ H(n=5, N=31, K=11)
The sample space is listed across the bottom, S = {0, 1, 2, …, 5}. The height of each bar represents the probability of that elementary event.
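The bar heights for this example can be reproduced with a short sketch. It assumes the standard Hypergeometric probability mass function and prints a rough text version of the chart; the variable names are illustrative only.

```python
from math import comb

n, N, K = 5, 31, 11  # parameters of the example above

# The support runs from max(0, n + K - N) to min(n, K); here that is 0 through 5.
xs = range(max(0, n + K - N), min(n, K) + 1)
probs = {x: comb(K, x) * comb(N - K, n - x) / comb(N, n) for x in xs}

for x, p in probs.items():
    # One '#' per percentage point, as a stand-in for the bar height.
    print(f"{x}  {p:.4f}  {'#' * round(100 * p)}")

# The mean of the distribution should equal n * K / N.
mean = sum(x * p for x, p in probs.items())
print("mean:", mean)
```

The tallest bar falls at x = 2, and the computed mean matches n·K/N = 55/31.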
© Ole J. Forsberg, Ph.D. 2024. All rights reserved.