Abstract
Technological advances that allow routine identification of high-dimensional risk factors have created high demand for statistical techniques that make full use of these rich sources of information in genome-wide association studies (GWAS). Variable selection for censored outcome data and control of false discoveries (i.e., inclusion of irrelevant variables) both present serious challenges when predictors are high-dimensional. In the context of survival analysis with high-dimensional covariates, this paper develops a computationally feasible method for building general risk prediction models while controlling false discoveries. The proposed high-dimensional variable selection method incorporates stability selection to control false discoveries. Comparisons with the commonly used univariate and Lasso approaches to variable selection show that the proposed method yields fewer false discoveries. The method is applied to study the associations of 2,339 common single-nucleotide polymorphisms (SNPs) with overall survival among cutaneous melanoma (CM) patients. The results confirm that BRCA2 pathway SNPs are likely to be associated with overall survival, as reported in the previous literature. Moreover, we identify several new Fanconi anemia (FA) pathway SNPs that are likely to modulate survival of CM patients.
Disciplines
Biostatistics
Suggested Citation
He, Kevin; Li, Yanming; Zhu, Ji; Liu, Hongliang; Lee, Jeffrey E.; Amos, Christopher I.; Hyslop, Terry; Jin, Jiashun; Wei, Qinyi; and Li, Yi, "Variable Selection with False Discovery Control" (January 2015). The University of Michigan Department of Biostatistics Working Paper Series. Working Paper 114.
https://biostats.bepress.com/umichbiostat/paper114