The Effect of Retrospective Sampling on Binary Regression Models for Clustered Data


Published in Biometrics (1990), 46, pp. 977-990.


Recently a great deal of attention has been given to binary regression models for clustered or correlated observations. The data of interest are of the form of a binary dependent or response variable, together with independent variables, where sets of observations are grouped together into clusters. A number of models and methods of analysis have been suggested to study such data. Many of these are extensions in some way of the familiar logistic regression model for binary data which are not grouped (i.e., each cluster is of size one). In general, the analyses of these clustered data models proceed by assuming that the observed clusters are a simple random sample of clusters selected from a population of clusters. In this paper, we consider the application of these procedures to the case where the clusters are selected randomly in a manner which depends on the pattern of responses in the cluster. For example, we show that ignoring the retrospective nature of the sample design, by fitting standard logistic regression models for clustered binary data, may result in misleading estimates of the effects of covariates and the precision of estimated regression coefficients.


Categorical Data Analysis | Epidemiology | Statistical Models

This document is currently not available here.