Standard prospective logistic regression analysis of case-control data often yields very imprecise estimates of gene-environment interactions because few cases or controls fall into the cells formed by cross-classifying genotype and exposure. In contrast, under the assumption of gene-environment independence, modern “retrospective” methods, including the “case-only” approach, can estimate the interaction parameters much more precisely, but they can be seriously biased when that assumption is violated. In this article, we propose a novel approach to analyzing case-control data that relaxes the gene-environment independence assumption using an empirical Bayes framework. In the special case of a binary gene and a binary exposure, the framework leads to a simple closed-form estimator of the odds-ratio interaction parameter that corresponds to a weighted average of the standard case-only and case-control estimators. We also describe a general approach for deriving the empirical Bayes estimator and its variance within the retrospective maximum-likelihood framework developed by Chatterjee and Carroll (2005). We conduct simulation studies to investigate the mean squared error of the proposed estimator in both fixed and random parameter settings. We also illustrate the application of this methodology using two real data examples. Both the simulated and the real data examples suggest that the proposed estimator strikes an excellent balance between bias and efficiency, depending on the true nature of the gene-environment association and the sample size for a given study.
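
To make the binary-gene, binary-exposure case concrete, the following is a minimal Python sketch of a weighted average of the case-only and case-control interaction estimates computed from two 2x2 tables. The abstract does not state the weights, so the shrinkage form used here (the case-control variance divided by the sum of that variance and the squared gene-environment log odds ratio among controls) is an assumption made for illustration, as are the function names `log_odds_ratio` and `eb_interaction` and the Woolf-type variance formula.

```python
import math


def log_odds_ratio(table):
    """Log odds ratio and Woolf-type variance for a 2x2 table.

    table[g][e] holds the count for genotype g and exposure e (each 0 or 1).
    All four counts must be positive.
    """
    (n00, n01), (n10, n11) = table
    estimate = math.log((n00 * n11) / (n01 * n10))
    variance = 1 / n00 + 1 / n01 + 1 / n10 + 1 / n11
    return estimate, variance


def eb_interaction(cases, controls):
    """Illustrative shrinkage estimate of the interaction log odds ratio.

    Assumed weighting (not taken from the abstract):
        w = var_cc / (theta_ge**2 + var_cc)
    so the estimate moves toward the case-only value when the controls
    show little gene-environment association.
    """
    # Case-only estimator: gene-environment log odds ratio among cases.
    beta_co, var_cases = log_odds_ratio(cases)
    # Gene-environment log odds ratio among controls.
    theta_ge, var_controls = log_odds_ratio(controls)
    # Case-control estimator: difference of the two log odds ratios.
    beta_cc = beta_co - theta_ge
    var_cc = var_cases + var_controls
    # Weight placed on the case-only estimator (assumed form).
    w = var_cc / (theta_ge ** 2 + var_cc)
    beta_eb = w * beta_co + (1 - w) * beta_cc
    return {"case_only": beta_co, "case_control": beta_cc,
            "weight_case_only": w, "empirical_bayes": beta_eb}


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions.
    cases = [[100, 50], [40, 60]]
    controls = [[120, 60], [50, 50]]
    for name, value in eb_interaction(cases, controls).items():
        print(f"{name}: {value:.4f}")
```

Under this assumed weighting, the combined estimate always lies between the case-only and case-control values, and it coincides with the case-only value when the estimated gene-environment log odds ratio among controls is exactly zero.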


