A Multiple Testing Approach to High-dimensional Association Studies with an Application to the Detection of Associations Between Risk Factors of Heart Disease and Genetic Polymorphisms
Abstract
We present an approach to association studies involving a dozen or so 'response' variables and a few hundred 'explanatory' variables. The approach emphasizes transparency, simplicity, and protection against spurious results. The proposed methods are largely non-parametric, and they are systematically rounded off by the Benjamini-Hochberg multiple testing procedure. We illustrate the approach by applying it to the detection of associations between risk factors of heart disease and genetic polymorphisms in the REGRESS dataset. Special attention is paid to the book-keeping and information-management aspects of data analysis, which, unglamorous as they are, allow the creation of an informative and reasonably digestible 'map of relationships', the end-product of an association study as far as Statistics is concerned.
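For readers unfamiliar with the Benjamini-Hochberg procedure named in the abstract, the following is a minimal sketch of its standard step-up form. It is an illustration only, not the authors' implementation: the function name `benjamini_hochberg`, the level `q = 0.05`, and the example p-values are assumptions introduced here and do not come from the REGRESS analysis.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean mask of hypotheses rejected by the
    Benjamini-Hochberg step-up procedure at level q (hypothetical helper)."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    reject = np.zeros(m, dtype=bool)
    if m == 0:
        return reject
    order = np.argsort(p)                      # indices that sort the p-values
    sorted_p = p[order]
    thresholds = q * np.arange(1, m + 1) / m   # q * i / m for i = 1..m
    below = sorted_p <= thresholds
    if below.any():
        k = np.nonzero(below)[0].max()         # largest rank meeting its threshold
        reject[order[:k + 1]] = True           # reject every hypothesis up to rank k
    return reject

# Made-up p-values, one per (response, explanatory) pair under test.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(example_p, q=0.05)
```

In an association study of the kind described, each p-value would come from one test of a response variable against an explanatory variable, and the mask marks the pairs retained once multiplicity is accounted for.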