Population: The totality of elements of interest; the “whole” from which the sample is selected.
Sample: A part of the population.
Population size, N: The number of elements in the population.
Sample size, n: The number of elements in the sample.
The purpose of sampling is to obtain estimates of unknown population characteristics.
The population characteristics of greatest interest in practice are the mean and total of a variable, and the proportion and number of elements in a category.
A sample may be selected randomly or by some other method (e.g., arbitrary, judgmental, “quota”).
There are several kinds of random samples: simple, stratified, two-stage, etc. Sampling may be with or (commonly) without replacement.
To select a simple random sample, it is necessary to have a list of the population elements. One method is to (a) prepare N tags; (b) number each with an element's ID number; (c) place the tags in a bowl and mix thoroughly; (d) select n tags with or without replacement. Other methods that are easier to implement use (a) 10 (rather than N) tags to form ID numbers; (b) tables of random numbers; or (c) computer-generated random numbers.
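The computer-generated variant of the tag-and-bowl procedure can be sketched in a few lines of Python. This is only an illustration: the function name simple_random_sample, the seed argument, and the ID range 1 to 20 are assumptions made for the example, not part of the method described above.

```python
import random

def simple_random_sample(ids, n, replace=False, seed=None):
    """Select n element IDs from the population list `ids`.

    replace=False mimics drawing tags without putting them back;
    replace=True mimics returning each tag to the bowl before the next draw.
    """
    rng = random.Random(seed)  # seed is optional, used only for repeatability
    ids = list(ids)
    if replace:
        return [rng.choice(ids) for _ in range(n)]
    return rng.sample(ids, n)  # raises ValueError if n exceeds the population size

# Hypothetical population list: ID numbers 1..20, so N = 20.
population_ids = range(1, 21)
print(simple_random_sample(population_ids, 5, seed=1))
print(simple_random_sample(population_ids, 5, replace=True, seed=1))
```

Without replacement no ID can appear twice in the sample; with replacement it can, and n may even exceed N.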
One appealing feature of simple random sampling is the lack of bias in the selection of the sample.
After a simple random sample is selected and the selected elements measured or interviewed, it is reasonable to use the (ordinary) sample mean of a variable as an estimate of the unknown population mean of that variable, and the (ordinary) proportion of sampled elements in a category as an estimate of the proportion of elements in the population in the same category. To estimate the population totals or numbers, multiply these estimates by the population size, N. We refer to these as the “sensible estimates.”
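As a small worked illustration of the four sensible estimates, the sketch below computes them from one sample. The function name sensible_estimates, the variable (income, in thousands), the category (car ownership), and all numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def sensible_estimates(sample_values, sample_in_category, N):
    """Return the sensible estimates of the population mean, total,
    proportion, and number from a simple random sample.

    sample_values      -- the variable measured on each sampled element
    sample_in_category -- True/False per sampled element (in the category or not)
    N                  -- the population size
    """
    n = len(sample_values)
    mean_hat = sum(sample_values) / n                                # ordinary sample mean
    prop_hat = sum(1 for flag in sample_in_category if flag) / n     # ordinary sample proportion
    return {
        "mean": mean_hat,
        "total": N * mean_hat,        # estimate of the population total
        "proportion": prop_hat,
        "number": N * prop_hat,       # estimate of the number in the category
    }

# Hypothetical sample of n = 4 households from a population of N = 1000.
incomes = [40, 55, 30, 75]
owns_car = [True, False, True, True]
est = sensible_estimates(incomes, owns_car, N=1000)
print(est)
```

Here the sample mean is 200/4 = 50, so the estimated total is 1000 × 50 = 50,000; three of the four sampled households are in the category, so the estimated proportion is 0.75 and the estimated number is 750.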
It can never be guaranteed (with certainty) that an estimate will be equal to, or within a given interval from, the corresponding population characteristic (unless, of course, sampling is without replacement and the sample size equals the population size).
Imagine an experiment whereby a large number of simple random samples of the same size are selected from a known population, and from each such sample a sensible estimate is made of the population characteristic. It will be observed that the average value of these sensible estimates approaches the population characteristic as the number of samples grows. In other words, in the long run and on average the sensible estimate neither under-estimates nor over-estimates, but is equal to the population characteristic. An estimate with this property is called an unbiased estimate.
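For a very small population the experiment can be carried to its limit by listing every possible simple random sample instead of drawing many at random. The sketch below does this for sampling without replacement; the population values and the helper exact_mean are made up for the illustration, and exact fractions are used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical known population (N = 5) and sample size n = 2.
population = [2, 4, 6, 8, 10]
n = 2

def exact_mean(values):
    """Arithmetic mean as an exact fraction."""
    values = list(values)
    return Fraction(sum(values), len(values))

population_mean = exact_mean(population)

# Every possible sample of size n without replacement is equally likely,
# so averaging over all of them gives the long-run average of the estimate.
estimates = [exact_mean(sample) for sample in combinations(population, n)]
average_estimate = exact_mean(estimates)

print("smallest and largest estimate:", min(estimates), max(estimates))
print("average of all estimates:", average_estimate)
print("population mean:", population_mean)
```

Individual estimates miss the population mean, some from below and some from above, but their average over all ten possible samples coincides with it.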
The property can be confirmed mathematically to hold in general. That is, the ordinary sample mean (total) and proportion (number) are unbiased estimates of the corresponding population mean (total) and proportion (number). Unbiasedness is considered a highly desirable property for an estimate to have.
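One common way to sketch the argument for the sample mean, under sampling without replacement, uses inclusion indicators. The notation is an assumption introduced here: y_i is the value of the variable for population element i, μ is the population mean, ȳ is the sample mean, and I_i equals 1 if element i is in the sample and 0 otherwise. In a simple random sample every element has the same chance n/N of being selected, so:

```latex
\bar{y} = \frac{1}{n}\sum_{i=1}^{N} I_i\, y_i ,
\qquad
E[I_i] = \frac{n}{N}
\;\Longrightarrow\;
E[\bar{y}] = \frac{1}{n}\sum_{i=1}^{N} \frac{n}{N}\, y_i
           = \frac{1}{N}\sum_{i=1}^{N} y_i = \mu ,
\qquad
E[N\bar{y}] = N\mu .
```

The proportion is the special case in which y_i is 1 for elements in the category and 0 otherwise, so the same argument covers the proportion and the number.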
It can be shown, however, that the sensible estimates are not, in general, unbiased if the random sample is other than simple. It is possible to construct unbiased estimates for each kind of random sample (e.g., stratified, two-stage, etc.). Unlike the sensible estimates, these unbiased estimates attach unequal weights to the sample observations. For details on how these estimates are constructed, consult any sampling text.
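The effect of the weights can be illustrated by enumeration for a stratified sample, that is, one in which a simple random sample is drawn from every group. The sketch below compares the ordinary (unweighted) sample mean with the usual stratified estimate, which weights each group's sample mean by that group's share of the population. The two groups, their values, and the sample sizes are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical population divided into two groups of unequal size.
strata = {"small": [10, 20], "large": [1, 2, 3, 4, 5, 6]}
n_h = {"small": 1, "large": 1}          # elements sampled from each group
N = sum(len(v) for v in strata.values())
population_mean = Fraction(sum(sum(v) for v in strata.values()), N)

def unweighted_mean(parts):
    """Ordinary sample mean: every observation gets the same weight."""
    values = [y for part in parts for y in part]
    return Fraction(sum(values), len(values))

def weighted_mean(parts):
    """Stratified estimate: each group mean is weighted by N_h / N."""
    return sum(
        Fraction(len(strata[name]), N) * Fraction(sum(part), len(part))
        for name, part in zip(strata, parts)
    )

# All possible stratified samples; each one is equally likely.
all_samples = list(product(*(combinations(strata[g], n_h[g]) for g in strata)))

avg_unweighted = sum(unweighted_mean(s) for s in all_samples) / len(all_samples)
avg_weighted = sum(weighted_mean(s) for s in all_samples) / len(all_samples)

print("population mean:", population_mean)
print("average of unweighted estimates:", avg_unweighted)
print("average of weighted estimates:", avg_weighted)
```

In this example the unweighted mean over-estimates on average, because the small group with large values is over-represented in the sample, while the weighted estimate averages exactly to the population mean.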
Stratified and two-stage random sampling require that the population elements be divided into groups according to some criterion. The sample is called stratified if a simple random sample of elements is selected from each and every group. The sample is called two-stage if, first, a simple random sample of groups is selected, then a simple random sample of elements is drawn from each selected group.
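The difference between the two selection schemes can be sketched as follows. The function names, the grouping of households into city blocks, and the choice of the same number of elements from every group are assumptions made to keep the example short; sampling is without replacement at every step.

```python
import random

def stratified_sample(groups, n_per_group, seed=None):
    """Draw a simple random sample of elements from each and every group."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_group)
            for name, members in groups.items()}

def two_stage_sample(groups, n_groups, n_per_group, seed=None):
    """First draw a simple random sample of groups, then a simple random
    sample of elements within each selected group."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(groups), n_groups)           # stage 1: groups
    return {name: rng.sample(groups[name], n_per_group)     # stage 2: elements
            for name in chosen}

# Hypothetical population: household IDs grouped by city block.
groups = {
    "block A": [1, 2, 3, 4, 5],
    "block B": [6, 7, 8, 9],
    "block C": [10, 11, 12, 13, 14, 15],
    "block D": [16, 17, 18],
}

print(stratified_sample(groups, n_per_group=2, seed=3))
print(two_stage_sample(groups, n_groups=2, n_per_group=2, seed=3))
```

The stratified sample contains elements from all four blocks, whereas the two-stage sample contains elements only from the two blocks selected at the first stage.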