Leah J. Welty, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine
Scott L. Zeger, The Johns Hopkins Bloomberg School of Public Health
Francesca Dominici, Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health


A distributed lag model (DLM) is a regression model that includes lagged exposure variables as covariates; its corresponding distributed lag (DL) function describes the relationship between the lag and the coefficient of the lagged exposure variable. DLMs have recently been used in environmental epidemiology for quantifying the cumulative effects of weather and air pollution on mortality and morbidity. Standard methods for formulating DLMs include unconstrained, polynomial, and penalized spline DLMs. These methods may fail to take full advantage of prior information about the shape of the DL function for environmental exposures, or for any other exposure with effects that are believed to smoothly approach zero as lag increases, and are therefore at risk of producing sub-optimal estimates.
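To make the definition concrete, the following is a minimal sketch of an unconstrained DLM fitted by ordinary least squares. It assumes a Gaussian linear outcome for simplicity, and the function names (`lag_matrix`, `fit_unconstrained_dlm`), the simulated data, and the chosen coefficients are all hypothetical illustrations rather than part of the paper's software.

```python
import numpy as np


def lag_matrix(x, max_lag):
    """Return an (n - max_lag) x (max_lag + 1) matrix; column l holds x lagged by l."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.column_stack([x[max_lag - l : n - l] for l in range(max_lag + 1)])


def fit_unconstrained_dlm(y, x, max_lag):
    """Least-squares estimate of an intercept and DL coefficients theta_0..theta_L.

    The first max_lag outcomes are dropped because their lagged exposures
    are not fully observed.
    """
    lags = lag_matrix(x, max_lag)
    design = np.column_stack([np.ones(len(lags)), lags])
    response = np.asarray(y, dtype=float)[max_lag:]
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef[0], coef[1:]


# Simulated illustration (assumed values): a DL function that decays to zero.
rng = np.random.default_rng(0)
max_lag = 3
x = rng.normal(size=500)
true_theta = np.array([0.5, 0.3, 0.1, 0.0])
lags = lag_matrix(x, max_lag)
y_observed = 1.0 + lags @ true_theta + 0.1 * rng.normal(size=len(lags))
y = np.concatenate([np.zeros(max_lag), y_observed])  # pad the unusable first rows

intercept, theta_hat = fit_unconstrained_dlm(y, x, max_lag)
```

Here `theta_hat` plotted against lag 0, 1, ..., L is the estimated DL function. Because each coefficient is estimated freely, nothing in this fit encourages the estimates to approach zero smoothly as lag increases.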

In this paper we propose a Bayesian DLM (BDLM) that incorporates prior knowledge about the shape of the DL function and also allows the degree of smoothness of the DL function to be estimated from the data. In a simulation study, we compare our Bayesian approach with alternative methods that use unconstrained, polynomial, and penalized spline DLMs. We also show that BDLMs encompass penalized spline DLMs: under certain assumptions, imposing a prior on the DL coefficients is analogous to smoothing the DL coefficients with a penalty specified by the prior. We apply our BDLM to data from the National Morbidity, Mortality, and Air Pollution Study (NMMAPS) to estimate the short-term effects of particulate matter air pollution on mortality from 1987 to 2000 in Chicago, Illinois. Software for fitting BDLMs and the data used in this paper are available online.
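The correspondence between a prior and a penalty can be checked numerically in the simplest setting. The sketch below assumes a Gaussian linear model with known error variance and a zero-mean Gaussian prior on the DL coefficients; the diagonal prior covariance with variances shrinking in lag is a hypothetical choice made only for illustration, and the function names are not taken from the paper's software.

```python
import numpy as np


def penalized_ls(X, y, penalty):
    """Minimize ||y - X theta||^2 + theta' P theta for a penalty matrix P."""
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)


def posterior_mean(X, y, prior_cov, sigma2):
    """Posterior mean of theta when y ~ N(X theta, sigma2 I) and theta ~ N(0, prior_cov)."""
    precision = X.T @ X / sigma2 + np.linalg.inv(prior_cov)
    return np.linalg.solve(precision, X.T @ y / sigma2)


# Simulated lagged-exposure design (assumed values, 8 lags).
rng = np.random.default_rng(1)
n_lags = 8
X = rng.normal(size=(200, n_lags))
y = X @ (0.5 * np.exp(-0.7 * np.arange(n_lags))) + rng.normal(size=200)

sigma2 = 1.0
# Hypothetical prior: variances shrink with lag, pulling later coefficients toward zero.
prior_cov = np.diag(np.exp(-0.5 * np.arange(n_lags)))

theta_bayes = posterior_mean(X, y, prior_cov, sigma2)
theta_penalized = penalized_ls(X, y, sigma2 * np.linalg.inv(prior_cov))
```

Under these assumptions the two estimates coincide, because the posterior mean solves the same linear system as least squares penalized by the error variance times the inverse prior covariance. Smaller prior variances at long lags therefore act as heavier penalties on those coefficients.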