This material was published as: S. Rose, M.J. van der Laan (2008). "Simple Optimal Weighting of Cases and Controls in Case-Control Studies," Int J Biostat: 4(1): Article 19. It was later adapted and also published in: S. Rose, M.J. van der Laan (2011). "Independent Case-Control Studies." In M.J. van der Laan and S. Rose, Targeted Learning: Causal Inference for Observational and Experimental Data, Chapter 13. New York, Springer.


Researchers studying uncommon diseases are often interested in assessing potential risk factors. Given the low incidence of disease, these studies are frequently case-control in design, since this allows a sufficient number of cases to be obtained without extensive sampling and can increase efficiency. However, case-control samples are biased, because the proportion of cases in the sample is not the same as in the population of interest. Methods for analyzing case-control studies have focused on logistic regression models, which provide conditional rather than causal estimates of the odds ratio. This article demonstrates the use of the prevalence probability and case-control weighted targeted maximum likelihood estimation (MLE), as described by van der Laan (2008), to obtain causal estimates of the parameters of interest (risk difference, relative risk, and odds ratio). It is meant as a guide for researchers, with step-by-step directions for implementing this methodology. We also present simulation studies that show the improved efficiency of the case-control weighted targeted MLE compared to other techniques.
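As a rough illustration of the weighting idea only (not the targeted MLE itself), the sketch below assigns each case a weight equal to the prevalence probability q0 and each control a weight of (1 - q0)/J, where J is the average number of controls per case in the sample. The function name `case_control_weights`, the toy data, and the value q0 = 0.05 are hypothetical and chosen purely for illustration; the prevalence is assumed to be known from outside the sample.

```python
import numpy as np

def case_control_weights(y, q0):
    """Return one weight per subject: q0 for cases, (1 - q0) / J for controls.

    y  : array of 0/1 outcome indicators (1 = case) from the case-control sample
    q0 : assumed known population prevalence P(Y = 1)
    J  : average number of controls per case, computed from the sample
    """
    y = np.asarray(y)
    n_cases = int(y.sum())
    n_controls = int(y.size - n_cases)
    if n_cases == 0 or n_controls == 0:
        raise ValueError("sample must contain both cases and controls")
    J = n_controls / n_cases
    return np.where(y == 1, q0, (1.0 - q0) / J)

# Hypothetical toy sample: 2 cases, 4 controls, with a binary exposure a.
y = np.array([1, 1, 0, 0, 0, 0])
a = np.array([1, 1, 1, 0, 0, 0])
w = case_control_weights(y, q0=0.05)

# Weighted proportions reflect the assumed population mix, not the sample mix.
weighted_prevalence = np.sum(w * y) / np.sum(w)
weighted_exposure = np.sum(w * a) / np.sum(w)
```

In this toy sample one third of the subjects are cases, yet the weighted case proportion equals the assumed prevalence q0. The same weights could in principle be passed to any weighted estimation routine, which is the sense in which the weighting corrects for the biased sampling.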


