A Multiple Testing Approach to High-dimensional Association Studies with an Application to the Detection of Associations Between Risk Factors of Heart Disease and Genetic Polymorphisms

José A. Ferreira, Department of Epidemiology and Biostatistics, VUMC, The Netherlands
Johannes Berkhof, Department of Epidemiology and Biostatistics, VUMC, The Netherlands
Olga Souverein, Division of Human Nutrition, Wageningen University
Koos Zwinderman, Department of Clinical Epidemiology and Biostatistics, AMC, Amsterdam

Abstract

We present an approach to association studies involving a dozen or so 'response' variables and a few hundred 'explanatory' variables that emphasizes transparency, simplicity, and protection against spurious results. The proposed methods are largely non-parametric, and each is systematically rounded off by the Benjamini-Hochberg method of multiple testing. An application to the detection of associations between risk factors of heart disease and genetic polymorphisms, using the REGRESS dataset, illustrates our approach at length. Special attention is paid to the book-keeping and information-management aspects of data analysis, which, unglamorous as they are, allow the creation of an informative and reasonably digestible 'map of relationships': the end-product of an association study as far as statistics is concerned.
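
For readers unfamiliar with it, the Benjamini-Hochberg step referred to above can be sketched as follows. This is a minimal illustration of the standard step-up procedure, not the authors' implementation; the function name `benjamini_hochberg`, the level `q = 0.05`, and the p-values in the usage example are all hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Standard Benjamini-Hochberg step-up procedure at level q
    (illustrative sketch; name and default level are assumptions).
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= q * k / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            k = rank

    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. one per (response, explanatory) pair.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(p, q=0.05)
discoveries = [i for i, r in enumerate(flags) if r]
print(discoveries)
```

Because the procedure is step-up, a hypothesis can be rejected even when its own p-value exceeds its own threshold, provided some larger p-value meets the threshold for its rank.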