Group (cluster) randomized trials (e.g. school and workplace settings)
From ICE Primer: A Tobacco Control Research Methodology Primer
In randomized clinical trials of a therapeutic intervention such as a drug, individual subjects are randomized to different conditions. For non-therapeutic interventions that are delivered in schools (school-based studies), workplaces (workplace studies), medical practices (clinical interventions), or entire communities (community interventions), the randomization of intact clusters of subjects to conditions may be required for several reasons, both practical and scientific. These reasons include:
- possible contamination of treatment conditions if individuals within the cluster talk to one another
- possible contamination of treatment conditions if the same provider is responsible for delivering both the intervention and the control condition
- the method of delivering the intervention may require all subjects in a physical location (e.g. classroom or clinic) to receive the same intervention
- delivery of intervention to intact clusters minimizes confusion within organizations
- if the intervention is eventually to be delivered at the level of the cluster, the results of a group randomized trial will be more generalizable
- possibility of studying the effects of factors at the cluster level, as well as at the level of the individual, on the response
- possibility of looking at interaction of cluster-level factors and the intervention
Randomizing intact clusters has several consequences for both the design and the analysis of a group randomized trial. Responses from individuals in the same intact group may be similar to each other, but different from responses from individuals in other groups. For example, there may be underlying factors related to the response that are similar within a cluster but different between clusters (e.g. income); individuals within a cluster may communicate; or there may be a provider effect, so that individuals taught by the same provider respond similarly. Whatever the reason, the consequence is that the individuals within a cluster should not be treated as independent subjects.
This has implications both for sample size calculations and for the application of statistical methods (e.g. multivariate regression analysis, logistic regression analysis), since standard methods in both cases generally assume that observations are independent from subject to subject. Sample size calculations need to take into account the degree of intraclass correlation between subjects within clusters, and the resulting sample sizes will generally be larger than those required for a standard randomized clinical trial of a therapeutic intervention. In general, for a given total number of subjects, using more clusters with fewer subjects per cluster will be more efficient than using fewer clusters with more subjects per cluster.
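The effect of intraclass correlation on sample size can be illustrated with the commonly used design effect, 1 + (m - 1) × ICC, where m is the number of subjects per cluster. The following is a minimal sketch, assuming equal cluster sizes; the function names and the example figures (200 subjects per arm, ICC of 0.02) are hypothetical and chosen only for illustration.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1.0 + (cluster_size - 1) * icc


def clusters_needed(n_individual, cluster_size, icc):
    """Clusters per arm needed to match an individually randomized trial.

    n_individual is the per-arm sample size that a standard calculation
    (assuming independent subjects) would give.
    """
    n_inflated = n_individual * design_effect(cluster_size, icc)
    return math.ceil(n_inflated / cluster_size)


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent subjects the clustered sample is 'worth'."""
    total = n_clusters * cluster_size
    return total / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical inputs: 200 subjects per arm if independent, ICC = 0.02.
    print(design_effect(25, 0.02))
    print(clusters_needed(200, 25, 0.02))
    # Same 1000 subjects, arranged two ways:
    print(effective_sample_size(40, 25, 0.02))   # fewer, larger clusters
    print(effective_sample_size(100, 10, 0.02))  # more, smaller clusters
```

With these illustrative numbers, 1000 subjects in 40 clusters of 25 are worth roughly 676 independent subjects, while the same 1000 subjects in 100 clusters of 10 are worth roughly 847, which is the sense in which more, smaller clusters are more efficient.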
Analysis methods likewise need to account for this within-cluster correlation. If it is ignored, the resulting standard errors for estimates of the treatment effect will be too small, and the significance of treatment effects will be overestimated. Methods designed to deal with correlated responses (e.g. multi-level modeling, generalized estimating equations) will need to be employed.
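A small simulation can make the problem of understated standard errors concrete. The sketch below does not implement multi-level modeling or generalized estimating equations; it substitutes a simpler cluster-level summary analysis (comparing cluster means) as a stand-in that respects the unit of randomization. All function names, the seed, and the simulation settings are assumptions made for illustration, and the simulated data contain no true treatment effect.

```python
import random
import statistics


def simulate_arm(rng, n_clusters, cluster_size, arm_mean, sd_between, sd_within):
    """Simulate one trial arm as a list of clusters (lists of responses).

    Each cluster shares a random cluster effect, which induces
    within-cluster correlation among its subjects.
    """
    clusters = []
    for _ in range(n_clusters):
        cluster_effect = rng.gauss(0.0, sd_between)
        clusters.append([
            arm_mean + cluster_effect + rng.gauss(0.0, sd_within)
            for _ in range(cluster_size)
        ])
    return clusters


def naive_se(arm_a, arm_b):
    """SE of the difference in means, wrongly treating subjects as independent."""
    flat_a = [y for cluster in arm_a for y in cluster]
    flat_b = [y for cluster in arm_b for y in cluster]
    return (statistics.variance(flat_a) / len(flat_a)
            + statistics.variance(flat_b) / len(flat_b)) ** 0.5


def cluster_level_se(arm_a, arm_b):
    """SE of the difference in means, using cluster means as the observations."""
    means_a = [statistics.mean(cluster) for cluster in arm_a]
    means_b = [statistics.mean(cluster) for cluster in arm_b]
    return (statistics.variance(means_a) / len(means_a)
            + statistics.variance(means_b) / len(means_b)) ** 0.5


if __name__ == "__main__":
    rng = random.Random(2024)  # arbitrary seed, for reproducibility only
    # Hypothetical settings: 10 clusters of 50 per arm, equal between- and
    # within-cluster standard deviations, no true treatment effect.
    treated = simulate_arm(rng, 10, 50, 0.0, 1.0, 1.0)
    control = simulate_arm(rng, 10, 50, 0.0, 1.0, 1.0)
    print("naive SE:        ", naive_se(treated, control))
    print("cluster-level SE:", cluster_level_se(treated, control))
```

Under these settings the naive standard error should come out several times smaller than the cluster-level one, so a test based on it would overstate the significance of any observed difference.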
Missing data can occur at the subject level, due to subject dropout or loss to follow-up, or at the cluster level, if a school or other setting withdraws from the study. In adjusting for subject dropout, the within-cluster correlation should be considered. Cluster-level dropout could greatly reduce the number of subjects available for analysis.
