Abstract

The past two decades have witnessed significant advances in high-throughput "omics" technologies such as genomics, proteomics, metabolomics, transcriptomics and radiomics. These technologies have enabled simultaneous measurement of the expression levels of tens of thousands of features from individual patient samples and have generated enormous amounts of data that require analysis and interpretation. One specific area of interest has been the relationship between these features and patient outcomes, such as overall and recurrence-free survival, with the goal of developing a predictive "omics" profile. Large-scale studies often have a large fraction of censored observations and potentially time-varying feature effects, and methods for handling them have been lacking. In this paper, we propose supervised methods for feature selection and survival prediction that deal with both issues simultaneously. Our approach uses continuum power regression (CPR), a framework that includes a variety of regression methods, in conjunction with the parametric or semi-parametric accelerated failure time (AFT) model. Both CPR and AFT fall within the linear models framework and, unlike black-box models, the proposed prognostic index has a simple yet useful interpretation. We demonstrate the utility of our methods using simulated and publicly available cancer genomics data.

Disciplines

Biochemical Phenomena, Metabolism, and Nutrition | Biochemistry | Bioinformatics | Biological Phenomena, Cell Phenomena, and Immunity | Biology | Biotechnology | Cancer Biology | Computational Biology | Genetic Processes | Genetics | Genetics and Genomics | Genetic Structures | Genomics | Integrative Biology | Life Sciences | Medical Biomathematics and Biometrics | Medical Biotechnology | Medical Genetics | Medical Molecular Biology | Medical Pathology | Medicine and Health Sciences | Molecular Genetics | Other Genetics and Genomics | Physiological Processes
