"Identification of biologically relevant subtypes via preweighted spars" by Sheila Gaynor and Eric Bair

The University of North Carolina at Chapel Hill Department of Biostatistics Technical Report Series

Title

Identification of biologically relevant subtypes via preweighted sparse clustering

Authors

Sheila Gaynor, University of North Carolina at Chapel Hill
Eric Bair, University of North Carolina at Chapel Hill

Abstract

Cluster analysis methods are used to identify homogeneous subgroups in a data set. Frequently one applies cluster analysis in order to identify biologically interesting subgroups. In particular, one may wish to identify subgroups that are associated with a particular outcome of interest. Conventional clustering methods often fail to identify such subgroups, particularly when the data set contains a large number of high-variance features. In that case, conventional methods may identify clusters associated with the high-variance features, whereas one wishes to obtain secondary clusters that are more interesting biologically or more strongly associated with a particular outcome of interest. We describe a modification of the sparse clustering method of Witten and Tibshirani (2010) that can be used to identify such secondary clusters or clusters associated with an outcome of interest. We show that this method can correctly identify such clusters of interest in several simulation scenarios. The method is also applied to a large case-control study of temporomandibular disorder and a breast cancer microarray data set.

Disciplines

Biostatistics | Microarrays | Statistical Methodology

Suggested Citation

Gaynor, Sheila and Bair, Eric, "Identification of biologically relevant subtypes via preweighted sparse clustering" (December 2012). The University of North Carolina at Chapel Hill Department of Biostatistics Technical Report Series. Working Paper 32.
http://biostats.bepress.com/uncbiostat/art32
