"Issues of Processing and Multiple Testing of SELDI-TOF MS Proteomic Da" by Merrill D. Birkner, Alan E. Hubbard et al.

U.C. Berkeley Division of Biostatistics Working Paper Series

Title

Issues of Processing and Multiple Testing of SELDI-TOF MS Proteomic Data

Authors

Merrill D. Birkner, Division of Biostatistics, School of Public Health, University of California, Berkeley
Alan E. Hubbard, Division of Biostatistics, School of Public Health, University of California, Berkeley
Mark J. van der Laan, Division of Biostatistics, School of Public Health, University of California, Berkeley
Christine F. Skibola, Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley
Christine M. Hegedus, Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley
Martyn T. Smith, Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley

Comments

Published 2006 in Statistical Applications in Genetics and Molecular Biology 5, article 11.

Abstract

We describe a new data filtering method for SELDI-TOF MS proteomic spectra. We examined technical repeats (two per subject) of intensity versus m/z (mass/charge) from bone marrow cell lysate for two groups of childhood leukemia patients: acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). As others have noted, the type of data processing, as well as experimental variability, can have a disproportionate impact on the list of "interesting" proteins (see Baggerly et al. (2004)). We propose a sequence of processing and multiple testing techniques: 1) correction for background drift; 2) filtering using smooth regression with cross-validated bandwidth selection; 3) peak finding; and 4) correction for multiple testing (van der Laan et al. (2005)). The result is a list of proteins (indexed by m/z) whose average expression differs significantly among disease (or treatment, etc.) groups. The procedures are intended to provide a sensible, statistically driven algorithm for producing this list. Provided there are no sources of unmeasured bias (such as confounding of experimental conditions with disease status), proteins found to be statistically significant using this technique have a low probability of being false positives.
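
The four steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the estimators, so the running-minimum baseline, the Gaussian-kernel (Nadaraya-Watson) smoother with leave-one-out cross-validation, the strict-local-maximum peak rule, and the Holm step-down correction (swapped in here for the van der Laan et al. (2005) procedures) are all assumptions chosen for brevity. Every function name, parameter, and the synthetic spectrum are hypothetical.

```python
import math

def subtract_baseline(intensity, window=5):
    """Step 1 (assumed method): subtract a running minimum to remove slow drift."""
    n = len(intensity)
    return [intensity[i] - min(intensity[max(0, i - window):min(n, i + window + 1)])
            for i in range(n)]

def kernel_smooth(x, y, bandwidth, leave_one_out=False):
    """Step 2 (assumed method): Gaussian-kernel regression evaluated at each x[i]."""
    overall_mean = sum(y) / len(y)
    fitted = []
    for i, xi in enumerate(x):
        num = den = 0.0
        for j, xj in enumerate(x):
            if leave_one_out and i == j:
                continue  # hold out the point being predicted
            w = math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2)
            num += w * y[j]
            den += w
        # If every weight underflows to zero, fall back to the overall mean.
        fitted.append(num / den if den > 0 else overall_mean)
    return fitted

def cv_bandwidth(x, y, candidates):
    """Choose the candidate bandwidth with the smallest leave-one-out squared error."""
    def loo_error(h):
        pred = kernel_smooth(x, y, h, leave_one_out=True)
        return sum((a - b) ** 2 for a, b in zip(y, pred))
    return min(candidates, key=loo_error)

def find_peaks(y, min_height=0.0):
    """Step 3 (assumed rule): strict local maxima at or above min_height."""
    return [i for i in range(1, len(y) - 1)
            if y[i - 1] < y[i] > y[i + 1] and y[i] >= min_height]

def holm_reject(pvalues, alpha=0.05):
    """Step 4 (stand-in method): Holm step-down control of the family-wise error rate."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    rejected = []
    for rank, i in enumerate(order):
        if pvalues[i] <= alpha / (m - rank):
            rejected.append(i)
        else:
            break  # step-down: stop at the first non-rejection
    return sorted(rejected)

# Synthetic spectrum (made-up numbers): linear drift plus one peak at index 20.
mz = list(range(40))
raw = [10 + 0.1 * i + 5 * math.exp(-0.5 * ((i - 20) / 2) ** 2) for i in mz]
corrected = subtract_baseline(raw, window=5)
h = cv_bandwidth(mz, corrected, candidates=[0.5, 1.0, 2.0, 4.0])
smoothed = kernel_smooth(mz, corrected, h)
peaks = find_peaks(smoothed, min_height=1.0)

# Hypothetical per-peak p-values from a two-group comparison (AML vs. ALL).
significant = holm_reject([0.001, 0.04, 0.03, 0.5], alpha=0.05)
```

In a real analysis the p-values would come from a test statistic computed at each retained peak across subjects, and the choice of error rate (family-wise error, false discovery proportion, and so on) would follow the paper rather than this sketch.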

Disciplines

Biostatistics | Statistical Methodology | Statistical Theory

Suggested Citation

Birkner, Merrill D.; Hubbard, Alan E.; van der Laan, Mark J.; Skibola, Christine F.; Hegedus, Christine M.; and Smith, Martyn T., "Issues of Processing and Multiple Testing of SELDI-TOF MS Proteomic Data" (December 2005). U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 200.
https://biostats.bepress.com/ucbbiostat/paper200

Included in

Biostatistics Commons, Statistical Methodology Commons, Statistical Theory Commons
