Missing observations pose a challenge in statistical analysis, especially when data are clustered. In this paper, we develop a Bayesian model averaging (BMA) approach for imputing missing observations in clustered data. Our approach extends BMA by allowing the weights of competing regression models for missing data imputation to vary between clusters while borrowing information across clusters in estimating model parameters. Through simulation and cross-validation studies, we demonstrate that our approach outperforms the standard BMA imputation approach, in which model weights are assumed to be the same for all clusters. We then apply our proposed method to a national dataset of daily ambient coarse particulate matter (PM10-2.5) concentrations between 2003 and 2005. We impute missing daily monitor-level PM10-2.5 measurements and estimate the posterior probability of PM10-2.5 nonattainment status for 95 US counties based on the Environmental Protection Agency's proposed 24-hour standard.
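The idea of cluster-specific model weights with pooled parameter estimates can be illustrated with a rough sketch. This is not the hierarchical Bayesian model developed in the paper. It is a plug-in approximation that assumes Gaussian linear regression models, equal prior model probabilities, and weights proportional to each cluster's likelihood under coefficients fitted to the pooled data. All function names, the two candidate models, and the simulated data are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients and residual variance for one candidate model."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, resid @ resid / len(y)

def gaussian_loglik(X, y, beta, sigma2):
    """Gaussian log-likelihood of y under a fitted linear model."""
    resid = y - X @ beta
    return -0.5 * len(y) * np.log(2 * np.pi * sigma2) - 0.5 * resid @ resid / sigma2

def impute(designs, y, cluster, observed):
    """Fill missing y with a cluster-weighted average of model predictions.

    designs: list of (n, p_k) design matrices, one per candidate model.
    Assumption: parameters are estimated once on the pooled observed data,
    while model weights are computed separately within each cluster.
    """
    fits = [fit_ols(X[observed], y[observed]) for X in designs]
    preds = np.column_stack([X @ beta for X, (beta, _) in zip(designs, fits)])
    y_filled = y.copy()
    weights = {}
    for c in np.unique(cluster):
        obs_c = observed & (cluster == c)
        ll = np.array([gaussian_loglik(X[obs_c], y[obs_c], beta, s2)
                       for X, (beta, s2) in zip(designs, fits)])
        w = np.exp(ll - ll.max())          # subtract the max for numerical stability
        weights[c] = w / w.sum()
        mis_c = ~observed & (cluster == c)
        y_filled[mis_c] = preds[mis_c] @ weights[c]
    return y_filled, weights

# Simulated example: cluster 0 is driven by x1, cluster 1 by x2.
rng = np.random.default_rng(0)
n = 400
cluster = np.repeat([0, 1], n // 2)
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = np.where(cluster == 0, 2 * x1, 2 * x2) + rng.normal(scale=0.5, size=n)
observed = rng.random(n) > 0.1             # roughly 10% of values missing
y[~observed] = np.nan

ones = np.ones(n)
designs = [np.column_stack([ones, x1]), np.column_stack([ones, x2])]
y_filled, weights = impute(designs, y, cluster, observed)
```

In this toy setting, each cluster places most of its weight on the model containing its own driving covariate, whereas a single set of weights shared by all clusters could not favor both models at once.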
Longitudinal Data Analysis and Time Series
Chang, Howard H.; Dominici, Francesca; and Peng, Roger D., "Bayesian Model Averaging for Clustered Data: Imputing Missing Daily Air Pollution Concentration" (December 2008). Johns Hopkins University, Dept. of Biostatistics Working Papers. Working Paper 177.