"Optimal Sample Size for Multiple Testing: the Case of Gene Expression " by Peter Muller, Giovanni Parmigiani et al.

Johns Hopkins University, Dept. of Biostatistics Working Papers

Title

Optimal Sample Size for Multiple Testing: the Case of Gene Expression Microarrays

Authors

Peter Muller, University of Texas M.D. Anderson Cancer Center
Giovanni Parmigiani, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University
Christian Robert, Ceremade, Universite Paris
Judith Rousseau, Universite Rene Descartes

Abstract

We consider the choice of an optimal sample size for multiple comparison problems. The motivating application is the choice of the number of microarray experiments to be carried out when learning about differential gene expression. However, the approach is valid in any application that involves multiple comparisons in a large number of hypothesis tests. We discuss two decision problems in the context of this setup: the sample size selection and the decision about the multiple comparisons. We adopt a decision theoretic approach, using loss functions that combine the competing goals of discovering as many differentially expressed genes as possible while keeping the number of false discoveries manageable. For consistency, we use the same loss function for both decisions. The decision rule that emerges for the multiple comparison problem takes the exact form of the rules proposed in the recent literature to control the posterior expected false discovery rate (FDR). For the sample size selection, we combine the expected utility argument with an additional sensitivity analysis, reporting the conditional expected utilities, conditioning on assumed levels of the true differential expression. We recognize the resulting diagnostic as a form of statistical power, which facilitates interpretation and communication. As a sampling model for observed gene expression densities across genes and arrays, we use a variation of a hierarchical Gamma/Gamma model. However, the discussion of the decision problem is independent of the chosen probability model. The approach is valid for any model that includes positive prior probabilities for the null hypotheses in the multiple comparisons, and that allows for efficient marginal and posterior simulation, possibly by dependent Markov chain Monte Carlo simulation.
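
As a rough illustration of a rule that controls the posterior expected FDR, the sketch below flags the genes with the largest posterior probabilities of differential expression, adding genes as long as the average posterior probability of a false discovery among the flagged genes stays at or below a bound. This is one common form of such a rule and is not taken from the paper itself; the function name flag_genes, the argument names post_probs and alpha, and the example probabilities are all hypothetical, and the posterior probabilities are assumed to come from whatever probability model is in use.

```python
def flag_genes(post_probs, alpha=0.05):
    """Flag genes so that the posterior expected FDR is at most alpha.

    post_probs[i] is an assumed posterior probability that gene i is
    differentially expressed. Returns (flagged indices, posterior expected FDR).
    """
    # Visit genes from the most to the least probable discovery.
    order = sorted(range(len(post_probs)),
                   key=lambda i: post_probs[i], reverse=True)

    flagged = []
    expected_false = 0.0  # running sum of (1 - post_probs[i]) over flagged genes
    for i in order:
        candidate = expected_false + (1.0 - post_probs[i])
        # The running average is non-decreasing in this order, so the
        # first violation of the bound ends the search.
        if candidate / (len(flagged) + 1) > alpha:
            break
        expected_false = candidate
        flagged.append(i)

    fdr = expected_false / len(flagged) if flagged else 0.0
    return flagged, fdr


# Illustrative (made-up) posterior probabilities for five genes.
flagged, fdr = flag_genes([0.99, 0.2, 0.97, 0.6, 0.95], alpha=0.05)
print(flagged, round(fdr, 4))
```

In this made-up example the three genes with probabilities 0.99, 0.97 and 0.95 are flagged; adding the gene with probability 0.6 would push the average above the bound.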

Disciplines

Design of Experiments and Sample Surveys | Genetics | Microarrays | Statistical Methodology | Statistical Theory

Suggested Citation

Muller, Peter; Parmigiani, Giovanni; Robert, Christian; and Rousseau, Judith, "Optimal Sample Size for Multiple Testing: the Case of Gene Expression Microarrays" (February 2004). Johns Hopkins University, Dept. of Biostatistics Working Papers. Working Paper 31.
https://biostats.bepress.com/jhubiostat/paper31

Previous Versions

August 29, 2003

Included in

Design of Experiments and Sample Surveys Commons, Genetics Commons, Microarrays Commons, Statistical Methodology Commons, Statistical Theory Commons

Collection of Biostatistics Research Archive
