The purpose of cancer genome sequencing studies is to determine the nature and types of alterations present in a typical cancer and to discover genes mutated at high frequencies. In this article, we discuss statistical methods for the analysis of data generated in these studies. We place special emphasis on a two-stage study design introduced by Sjoblom et al. [1]. In this context, we describe statistical methods for constructing scores that can be used to prioritize candidate genes for further investigation and to assess the statistical significance of the candidates thus identified.


Bioinformatics | Computational Biology