Abstract
The past two decades have witnessed significant advances in high-throughput "omics" technologies such as genomics, proteomics, metabolomics, transcriptomics and radiomics. These technologies have enabled simultaneous measurement of the expression levels of tens of thousands of features from individual patient samples, and they have generated enormous amounts of data that require analysis and interpretation. One specific area of interest has been the relationship between these features and patient outcomes, such as overall and recurrence-free survival, with the goal of developing a predictive "omics" profile. Large-scale studies often suffer from a large fraction of censored observations and from potential time-varying effects of features, and methods for handling both have been lacking. In this paper, we propose supervised methods for feature selection and survival prediction that deal with both issues simultaneously. Our approach combines continuum power regression (CPR), a framework that includes a variety of regression methods, with the parametric or semi-parametric accelerated failure time (AFT) model. Both CPR and AFT fall within the linear models framework and, unlike black-box models, the proposed prognostic index has a simple yet useful interpretation. We demonstrate the utility of our methods using simulated data and publicly available cancer genomics data.
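As a rough illustration of why a linear prognostic index is easy to interpret under an AFT model, the generic textbook form of the model can be written as below. This is a sketch under stated assumptions, not the paper's own formulation: the symbols $T_i$, $x_i$, $\beta$, $\mu$, $\sigma$, $\varepsilon_i$, $W$, $\gamma$ and $z_i$ are introduced here for illustration and do not appear in the abstract, and the weight matrix $W$ stands in for whatever components a supervised dimension-reduction step such as CPR would produce.

```latex
% Generic AFT model (assumed notation): T_i is the survival time of patient i,
% x_i is the p-vector of "omics" features, and eps_i is an error term whose
% distribution is specified (parametric) or left unspecified (semi-parametric).
\log T_i = \mu + \eta_i + \sigma \varepsilon_i ,
\qquad
\eta_i = x_i^{\top} \beta .

% With dimension reduction, the coefficient vector is restricted to the span of
% k << p component weight vectors, the columns of a p x k matrix W (assumed):
\beta = W \gamma
\quad \Longrightarrow \quad
\eta_i = (W^{\top} x_i)^{\top} \gamma = z_i^{\top} \gamma .

% Interpretation: holding the error term fixed, two patients whose indices
% differ by \Delta\eta have survival times that differ by a multiplicative factor
\frac{T_i}{T_j} = \exp(\eta_i - \eta_j) = \exp(\Delta\eta) .
```

In this generic form, the index $\eta_i$ acts directly on the log-time scale, so a difference in index values translates into a ratio of survival times rather than a ratio of hazards, and no proportional hazards assumption is required.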
Disciplines
Biochemical Phenomena, Metabolism, and Nutrition | Biochemistry | Bioinformatics | Biological Phenomena, Cell Phenomena, and Immunity | Biology | Biotechnology | Cancer Biology | Computational Biology | Genetic Processes | Genetics | Genetics and Genomics | Genetic Structures | Genomics | Integrative Biology | Life Sciences | Medical Biomathematics and Biometrics | Medical Biotechnology | Medical Genetics | Medical Molecular Biology | Medical Pathology | Medicine and Health Sciences | Molecular Genetics | Other Genetics and Genomics | Physiological Processes
Suggested Citation
Spirko-Burns, Lauren and Devarajan, Karthik, "Supervised Dimension Reduction for Large-scale 'Omics' Data with Censored Survival Outcomes Under Possible Non-proportional Hazards" (March 2019). COBRA Preprint Series. Working Paper 119.
https://biostats.bepress.com/cobra/art119