Randomized trials remain the most accepted design for estimating the effects of interventions, but they do not necessarily answer a question of primary interest: Will the program be effective in a target population in which it may be implemented? In other words, are the results generalizable? There has been very little statistical research on how to assess the generalizability, or "external validity," of randomized trials. We propose the use of propensity-score-based metrics to quantify the similarity of the participants in a randomized trial and a target population. In this setting the propensity score model predicts participation in the randomized trial, given a set of covariates. The resulting propensity scores are used first to quantify the difference between the trial participants and the target population, and then to weight the control group outcomes to the population, assessing how well the weighted outcomes track the outcomes actually observed in the population. These metrics can serve as a first step in assessing the generalizability of results from randomized trials to target populations. This paper lays out these ideas, discusses the assumptions underlying the approach, and illustrates the metrics using data on the evaluation of a schoolwide prevention program called Positive Behavioral Interventions and Supports.
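As a rough illustration of the two uses of the propensity score described above, the following Python sketch runs the workflow on simulated data using only the standard library. Everything specific in it is an assumption made for illustration, not the authors' implementation: the single covariate, the simulated selection mechanism, the hand-rolled logistic fit (`fit_logistic`), the standardized difference in mean propensity scores (computed here between trial participants and the rest of the population), and the choice of inverse-probability weights 1/p for the control group.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def propensity(beta, x):
    # Estimated probability of trial participation given covariates x.
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))


def fit_logistic(X, s, lr=1.0, iters=300):
    # Hypothetical helper: batch gradient ascent on the logistic
    # log-likelihood. Returns [intercept, slope_1, ...].
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(iters):
        grad = [0.0] * (k + 1)
        for x, si in zip(X, s):
            err = si - propensity(beta, x)
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def mean(values):
    return sum(values) / len(values)


def score_difference(p_trial, p_rest):
    # Assumed metric: difference in mean propensity score between the two
    # groups, divided by the standard deviation of all scores.
    pooled = p_trial + p_rest
    m = mean(pooled)
    sd = math.sqrt(sum((p - m) ** 2 for p in pooled) / (len(pooled) - 1))
    return (mean(p_trial) - mean(p_rest)) / sd


def weighted_mean(y, p):
    # Assumed weighting: each control unit counts 1/p, so units that were
    # unlikely to join the trial stand in for more of the population.
    w = [1.0 / pi for pi in p]
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def demo(seed=0, n=2000):
    rng = random.Random(seed)
    # Simulated population: one covariate; units with larger values are
    # more likely to participate in the trial.
    X = [[rng.gauss(0, 1)] for _ in range(n)]
    s = [1 if rng.random() < sigmoid(-1.0 + x[0]) else 0 for x in X]
    # Outcome without the intervention, observed for every unit (a
    # simulation assumption).
    y = [2.0 + x[0] + rng.gauss(0, 0.5) for x in X]
    # Roughly half of the trial participants are randomized to control.
    control = [si == 1 and rng.random() < 0.5 for si in s]

    beta = fit_logistic(X, s)
    p = [propensity(beta, x) for x in X]

    p_trial = [pi for pi, si in zip(p, s) if si == 1]
    p_rest = [pi for pi, si in zip(p, s) if si == 0]
    y_ctrl = [yi for yi, c in zip(y, control) if c]
    p_ctrl = [pi for pi, c in zip(p, control) if c]

    return {
        "score_difference": score_difference(p_trial, p_rest),
        "population_mean": mean(y),
        "unweighted_control_mean": mean(y_ctrl),
        "weighted_control_mean": weighted_mean(y_ctrl, p_ctrl),
    }


for name, value in demo().items():
    print(f"{name}: {value:.3f}")
```

In this simulation, a large `score_difference` signals that trial participants differ from the rest of the population on the covariate. If the participation model is adequate, `weighted_control_mean` should sit closer to `population_mean` than `unweighted_control_mean` does, which is the kind of check the abstract describes.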
Stuart, Elizabeth A.; Cole, Stephen R.; Bradshaw, Catherine P.; and Leaf, Philip J., "THE USE OF PROPENSITY SCORES TO ASSESS THE GENERALIZABILITY OF RESULTS FROM RANDOMIZED TRIALS" (May 2010). Johns Hopkins University, Dept. of Biostatistics Working Papers. Working Paper 210.