Bayesian classification of tumors using gene expression data

Bani K. Mallick, Texas A&M University
Debashis Ghosh, University of Michigan
Malay Ghosh, University of Florida


Precise classification of tumors is critical for cancer diagnosis and treatment. In recent years, there has been a move towards the use of cDNA microarrays for tumor classification. These high-throughput assays provide relative mRNA expression measurements simultaneously for thousands of genes. A key statistical task is to classify tumors on the basis of their expression patterns. This paper considers several Bayesian classification methods, based on reproducing kernel Hilbert spaces, for the analysis of microarray data. We consider the logistic likelihood as well as likelihoods related to support vector machine (SVM) models. Simulations and examples show that SVM models with multiple shrinkage parameters produce fewer misclassification errors than several existing classical methods, as well as Bayesian methods that are based on the logistic likelihood or that involve only one shrinkage parameter.