The case-crossover design uses cases only. It compares exposures just prior to the event times with exposures at comparable control, or “referent,” times in order to assess the effect of short-term exposure on the risk of a rare event. It has commonly been used to study the effect of air pollution on the risk of various adverse health events. Proper selection of referents is crucial, especially with air pollution exposures, which are shared, highly seasonal, and often have a long-term time trend. Careful referent selection is needed to control for time-varying confounders and to ensure that the distribution of exposure is constant across referent times, a key assumption of this method. Yet the referent strategy is important for a more basic reason: the conditional logistic regression estimating equations commonly used are biased when referents are not chosen a priori and are functions of the observed event times. We call this bias in the estimating equations overlap bias. In this paper, we propose a new taxonomy of referent selection strategies in order to emphasize their statistical properties. We give a derivation of overlap bias, explore its magnitude, and consider how the bias depends on properties of the exposure series. We conclude that the bias is usually small, though highly unpredictable, and easily avoided.
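
For orientation, the conditional logistic regression estimating equation mentioned in the abstract can be sketched as follows. The notation is assumed here for illustration and is not taken from the paper: t_i is the event time of case i, W_i is that case's referent window (taken to include t_i), x_t is the shared exposure at time t, and β is the log relative risk per unit of exposure.

```latex
% Standard conditional logistic score for a case-crossover analysis.
% Symbols (t_i, W_i, x_t, \beta) are illustrative assumptions, not the paper's notation.
U(\beta) \;=\; \sum_{i=1}^{n} \left[ \, x_{t_i} \;-\;
  \frac{\sum_{s \in W_i} x_s \, e^{\beta x_s}}{\sum_{s \in W_i} e^{\beta x_s}} \, \right]
```

Each bracketed term compares the exposure at the event time with a weighted average of exposures over the referent window. If the windows are fixed in advance, each term has expectation zero at the true β for a rare event. If a window is instead defined relative to the observed event time, that expectation is in general not zero, which appears to be the situation the abstract calls overlap bias.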


Design of Experiments and Sample Surveys | Epidemiology | Statistical Models

Previous Versions

August 21, 2003