On Estimation of Vaccine Efficacy Using Validation Samples with Selection Bias

Daniel O. Scharfstein, Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
M. Elizabeth Halloran, Department of Biostatistics, Rollins School of Public Health, Emory University
Haitao Chu, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health
Michael J. Daniels, Department of Statistics, University of Florida


Using validation sets for outcomes can greatly improve the estimation of vaccine efficacy (VE) in the field (Halloran and Longini 2001; Halloran et al. 2003). Most statistical methods for using validation sets rely on the assumption that outcomes for those without cultures are missing at random (MAR). However, validation sets are often not chosen at random. For example, confirmatory cultures are often performed on people with influenza-like illness as part of routine influenza surveillance. VE estimates based on such non-MAR validation sets could be biased. Here we propose frequentist and Bayesian approaches for estimating vaccine efficacy in the presence of validation bias. Our work builds on the ideas of Rotnitzky et al. (1998, 2001), Scharfstein et al. (1999, 2003), and Robins et al. (2000). Our methods require expert opinion about the nature of the validation selection bias. In a re-analysis of an influenza vaccine study, we found, using the beliefs of a flu expert, that within any plausible range of selection bias the VE estimate based on the validation sets is much higher than the point estimate using just the nonspecific case definition. Our approach is generally applicable to studies with missing binary outcomes and categorical covariates.
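
To make the idea of a selection-bias sensitivity analysis concrete, the following is a minimal sketch and not the estimator developed in this paper. It assumes a simple pattern-mixture form in which the log odds of culture positivity among uncultured cases differs from that among cultured cases by a fixed offset `alpha`, so that `alpha = 0` corresponds to MAR. The function names, the parameter `alpha`, and all counts are hypothetical and chosen only for illustration.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Log odds; requires 0 < p < 1."""
    return math.log(p / (1.0 - p))


def confirmed_attack_rate(n, ili, cultured, positive, alpha):
    """Attack rate of culture-confirmed illness in one arm.

    n        -- participants in the arm
    ili      -- cases meeting the nonspecific case definition
    cultured -- nonspecific cases that were cultured (validation set)
    positive -- cultured cases that were culture-positive
    alpha    -- ASSUMED log odds ratio of culture positivity,
                uncultured vs. cultured cases (0 means MAR)
    """
    p_cultured = positive / cultured
    # Shift the positivity probability for uncultured cases by alpha
    # on the logit scale.
    p_uncultured = expit(logit(p_cultured) + alpha)
    frac_cultured = cultured / ili
    p_pos_given_ili = (frac_cultured * p_cultured
                       + (1.0 - frac_cultured) * p_uncultured)
    return (ili / n) * p_pos_given_ili


def vaccine_efficacy(vacc, unvacc, alpha_v=0.0, alpha_u=0.0):
    """VE = 1 - (attack rate in vaccinated) / (attack rate in unvaccinated)."""
    ar_v = confirmed_attack_rate(*vacc, alpha_v)
    ar_u = confirmed_attack_rate(*unvacc, alpha_u)
    return 1.0 - ar_v / ar_u


# Hypothetical counts: (n, ili, cultured, positive) for each arm.
VACC = (1000, 100, 40, 8)
UNVACC = (1000, 150, 60, 30)

if __name__ == "__main__":
    # Sweep the assumed selection-bias parameter in the vaccinated arm.
    for alpha in (-1.0, -0.5, 0.0, 0.5, 1.0):
        ve = vaccine_efficacy(VACC, UNVACC, alpha_v=alpha)
        print(f"alpha_v = {alpha:+.1f}  VE = {ve:.3f}")
```

In a sketch like this, an expert's beliefs would enter as a plausible range for `alpha` in each arm, and the analyst would report how the VE estimate varies over that range rather than a single MAR-based value.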