Sample Size Determination for a Survey
From ICE Primer: A Tobacco Control Research Methodology Primer
Sample size determination and sample allocation to subgroups are two of the most important decisions in the design of a survey. Generally, the funding, personnel and time available for the survey constrain the sample size. However, at the time of applying for funding the investigator should consider whether the contemplated sample sizes, overall and for subgroups, will provide the precision and statistical power needed to address the research questions.
If the sampling design is to be simple random sampling from a very large population, the half-width of a $100(1-\alpha)\%$ confidence interval for a proportion $p$ not too close to 0 or 1 has the formula

$$Z_{1-\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},$$

where $n$ is the sample size, $\hat{p}$ is the estimate of $p$ from the sample, and $Z_{1-\alpha/2}$ is the critical point of the standard normal variable $Z$ for a two-sided test at the $\alpha$ level of significance. The maximum value of the half-width is achieved when $\hat{p} = 0.5$. Thus if $n = 1000$, a 95% confidence interval for the proportion $p$ is of the form

$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{1000}},$$

and the half-width is at most

$$1.96\sqrt{\frac{0.5 \times 0.5}{1000}},$$

or 0.031. When the proportion is expressed as a percentage, the half-width of the 95% confidence interval is at most 3.1 percentage points.
Using the general simple random sampling formula, when we know beforehand an upper bound on the proportion of interest, we can use it in calculating precision. As an example, if we want to estimate smoking prevalence in a population from a simple random sample, and we consider 37% to be an upper limit, we can say that a sample of size n = 7500 would give us the estimate to within plus or minus
$$1.96\sqrt{\frac{0.37 \times 0.63}{7500}} \approx 0.011,$$

or 1.1 percentage points, with 95% confidence.
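The two simple random sampling calculations above can be checked with a few lines of Python. This is a minimal sketch using the normal approximation; the function name `srs_half_width` is an illustrative choice, not part of the primer.

```python
from math import sqrt
from statistics import NormalDist

def srs_half_width(p, n, confidence=0.95):
    """Half-width of a normal-approximation CI for a proportion under SRS."""
    # Two-sided critical point, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)

worst_case = srs_half_width(0.5, 1000)    # p = 0.5 maximizes the half-width
prevalence = srs_half_width(0.37, 7500)   # 37% taken as an upper bound
print(f"{worst_case:.3f} {prevalence:.3f}")  # prints 0.031 0.011
```

Because the half-width shrinks with the square root of n, halving it requires roughly four times the sample size.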
Usually, the sampling design is not simple random sampling, but a stratified multistage sampling of households, followed by the selection of individuals within households. For example, the sample size of 7500 persons might be obtained from a sample of 2500 households, from a complex design. Then the respondents cannot be deemed to be independent, because of their clustering within households, and because the households themselves may be sampled from neighbourhoods or villages selected at a prior stage. Sample size determination must then take into account the design effect, or the factor by which the true sampling design changes (usually increases) the variance of a variable of interest, from the variance under simple random sampling with the same number of respondents. The design effect depends in general on the design specification and on the variable of interest. It is then necessary to assume a range of plausible values of the design effect. In the previous example, if we assume 0.25 for the intra-household correlation ρ, and 3 respondents interviewed per household, and a design effect for the household sample of 1.4, the half-width of the 95% confidence interval becomes
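$$1.96\sqrt{\frac{0.37 \times 0.63}{7500} \times [1 + (3-1) \times 0.25] \times 1.4} \approx 0.016,$$

or 1.6 percentage points.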
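The design-adjusted figure can be reproduced as follows. The sketch assumes that the within-household clustering factor takes the common form 1 + (m − 1)ρ for m respondents per household, and that it multiplies the household-sample design effect; this combination is an assumption that is consistent with the 1.6 percentage points quoted above. The function name is hypothetical.

```python
from math import sqrt

def design_adjusted_half_width(p, n, rho, per_household, deff_household, z=1.96):
    """Half-width of a CI for a proportion, inflated by an assumed design effect."""
    # Variance inflation from clustering of respondents within households.
    deff_within = 1 + (per_household - 1) * rho
    # Assumed: the two sources of inflation combine multiplicatively.
    deff_total = deff_within * deff_household
    return z * sqrt(p * (1 - p) / n * deff_total)

hw = design_adjusted_half_width(
    0.37, 7500, rho=0.25, per_household=3, deff_household=1.4
)
print(round(hw, 3))  # prints 0.016
```

Here the total design effect is 1.5 × 1.4 = 2.1, so the 7500 respondents carry about as much information as 7500 / 2.1 ≈ 3571 respondents under simple random sampling.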
As another example, consider a longitudinal survey, where the sample is expected to yield 1460 cigarette smokers at baseline (Wave 1), and the wave-to-wave retention rate is expected to be 0.8 or 80%, so that there will be
$1460 \times 0.8 = 1168$ smokers present at both Waves 1 and 2. Suppose we are interested in estimating the proportion of cigarette smokers making a quit attempt between Waves 1 and 2, and suppose the proportion is expected to be around 6%. Assume an intra-household correlation for quit attempts of smokers (with between 1 and 2 per household) of 0.1, and a design effect of 1.4.
Then the 95% confidence interval for the proportion making a quit attempt will have half-width about
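$$1.96\sqrt{\frac{0.06 \times 0.94}{1168} \times [1 + (2-1) \times 0.1] \times 1.4} \approx 0.017,$$

or 1.7 percentage points, where the household clustering factor is evaluated at 2 smokers per household.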
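Since the number of smokers per household is only known to lie between 1 and 2, it can help to see how much the answer depends on that value. The sketch below assumes the same clustering factor 1 + (m − 1)ρ as in the household example and evaluates it at m = 1, 1.5 and 2; the intermediate value 1.5 is an illustrative choice.

```python
from math import sqrt

n_retained = round(1460 * 0.8)   # smokers expected at both waves: 1168
p_quit = 0.06                    # expected proportion making a quit attempt
rho = 0.1                        # intra-household correlation
deff_household = 1.4             # design effect from the source example

def half_width(m):
    """95% CI half-width assuming m smokers per household."""
    deff = (1 + (m - 1) * rho) * deff_household
    return 1.96 * sqrt(p_quit * (1 - p_quit) / n_retained * deff)

widths = {m: half_width(m) for m in (1, 1.5, 2)}
for m, hw in widths.items():
    print(f"{m} smokers per household: +/- {100 * hw:.1f} percentage points")
```

Across this range the half-width changes by less than a tenth of a percentage point, so the conclusion is not sensitive to the exact household size.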
Plausible design effects can be obtained from previous surveys of the same type, by considering the ratio of variance estimates taking the design into account to variance estimates assuming simple random sampling of the same size. For confidence intervals for quantities such as regression coefficients or correlations, this kind of calculation is most helpful.
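For a proportion, the ratio described here can be computed directly from a published design-based standard error. The following is a minimal sketch; the function name and the input values are hypothetical and do not come from any actual survey.

```python
def design_effect(se_design, p_hat, n):
    """Ratio of the design-based variance to the SRS variance of a proportion."""
    var_design = se_design ** 2          # variance taking the design into account
    var_srs = p_hat * (1 - p_hat) / n    # variance assuming SRS of the same size
    return var_design / var_srs

# Hypothetical inputs: an earlier survey reporting 37% prevalence from
# 7500 respondents with a design-based standard error of 0.0081.
deff = design_effect(0.0081, 0.37, 7500)
print(round(deff, 2))
```

Computing this ratio for several variables from earlier surveys gives the range of plausible design effects to carry into the planning calculations.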
Many reviewers of applications for funding expect not only precision estimates but also power calculations. Power calculations are very sensitive to model assumptions and should be interpreted with caution. That said, reviewers may find statements like the following helpful, in the context of GEE modeling of a binary repeated measure in a longitudinal survey of two waves:
"Assume a design inflation (from household clusters and multistage sampling) of standard errors of 1.45, as seen in a similar analysis [citation]. Then the effective sample size is approximately halved, since the design effect is $1.45^2 = 2.10$. Assume also a correlation of 0.4 for the Wave 1 and Wave 2 responses of an individual. With a two-sided test of significance at the 5% level there would be approximately 80% power to reject the null hypothesis $H_0: \beta = 0$ when $\beta = 0.154$, or the odds ratio $\exp(\beta) = 1.42$."
Carrying out this calculation rigorously effectively requires simulating the observations assuming β = 0.154 and a correlation of 0.4, and verifying that the two-sided significance test at the 5% level rejects the null hypothesis about 80% of the time.
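A simulation of this kind might be sketched as follows. This is a simplified stand-in, not a GEE analysis, and it rests on several assumptions that are not in the source: β is taken to be the change in log-odds from Wave 1 to Wave 2, the Wave 1 prevalence `p1` and the number of respondents are hypothetical, the complex design is approximated by simulating an effective sample size n / 1.45², and the test is a Wald test on the paired difference in proportions. It requires NumPy.

```python
import numpy as np

def simulate_power(n_eff, p1, beta, corr, n_sims=2000, z_crit=1.96, seed=0):
    """Monte Carlo rejection rate for a two-wave change in a binary outcome."""
    rng = np.random.default_rng(seed)
    # Wave 2 prevalence implied by a log-odds shift of beta (assumed model).
    logit1 = np.log(p1 / (1 - p1))
    p2 = 1 / (1 + np.exp(-(logit1 + beta)))
    # Joint probabilities of (y1, y2) giving the requested correlation.
    p11 = p1 * p2 + corr * np.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    cells = np.array([p11, p1 - p11, p2 - p11, 1 - p1 - p2 + p11])
    if cells.min() < 0:
        raise ValueError("correlation not attainable for these prevalences")
    # Each row holds the counts (n11, n10, n01, n00) for one simulated survey.
    counts = rng.multinomial(n_eff, cells, size=n_sims)
    n10 = counts[:, 1]
    n01 = counts[:, 2]
    # Paired difference d = y2 - y1: its mean estimates p2 - p1.
    diff = (n01 - n10) / n_eff
    var_d = (n01 + n10) / n_eff - diff ** 2
    se = np.sqrt(np.maximum(var_d, 0.0) / n_eff)
    with np.errstate(divide="ignore", invalid="ignore"):
        z = np.where(se > 0, diff / se, 0.0)
    # Fraction of simulated surveys in which the two-sided test rejects.
    return float(np.mean(np.abs(z) > z_crit))

n_respondents = 2000                      # hypothetical, not from the source
n_eff = round(n_respondents / 1.45 ** 2)  # effective sample size
power = simulate_power(n_eff, p1=0.3, beta=0.154, corr=0.4)
print(f"estimated power: {power:.2f}")
```

Running the same function with `beta=0.0` gives the simulated type I error rate, which should be close to 5% if the test is behaving as intended. A fuller version would simulate the household clustering itself and fit the GEE model to each simulated data set.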
