"On Comparing the Clustering of Regression Models Method with K-means C" by Li-Xuan Qin and Steven G. Self

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

Title

On Comparing the Clustering of Regression Models Method with K-means Clustering

Authors

Li-Xuan Qin, Memorial Sloan-Kettering Cancer Center
Steven G. Self, Fred Hutchinson Cancer Research Center

Abstract

Gene clustering is a common question addressed with microarray data. Previous methods, such as K-means clustering and hierarchical clustering, base gene clustering directly on the observed measurements. A new model-based clustering method, the clustering of regression models (CORM) method, bases the clustering of genes on their relationship to covariates. It explicitly models different sources of variation and bases gene clustering solely on the systematic variation. Because both are partitional clustering methods, CORM is closely related to K-means clustering. In this paper, we discuss the relationship between the two clustering methods in terms of both model formulation and implications for other important aspects of cluster analysis. We show that both methods can be considered solutions to a least squares problem with missing data, but each concerns a different type of least squares. We also show that CORM tends to provide stable clusters across samples and is particularly useful if the cluster averages are used as predictors for sample classification. Finally, we illustrate the application of CORM to a set of time course data measured on four yeast samples, which has a complicated experimental design and is difficult for K-means to handle.
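
The least squares reading of K-means mentioned in the abstract can be illustrated with a minimal sketch, assuming NumPy and treating the unknown cluster labels as the "missing data" (this interpretation is an assumption, since the abstract does not spell it out). The function names kmeans_least_squares and cluster_by_regression_coefficients are hypothetical. The second function is not the CORM method, whose algorithm is not given on this page; it is a simplified stand-in that clusters genes on per-gene regression coefficients rather than on the raw measurements, only to show the contrast in what gets clustered.

```python
import numpy as np

def kmeans_least_squares(Y, k, n_iter=50, seed=0):
    """Lloyd's algorithm: alternately fill in labels and refit cluster means.

    Y is an (n_items, n_features) array. Returns labels, centers, and the
    residual sum of squares that the alternation tries to reduce.
    """
    Y = np.asarray(Y, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct rows of Y chosen at random.
    centers = Y[rng.choice(len(Y), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every row to every center, shape (n, k).
        d2 = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)          # fill in the "missing" labels
        for j in range(k):
            if np.any(labels == j):         # leave empty clusters unchanged
                centers[j] = Y[labels == j].mean(axis=0)
    # Recompute labels and the objective against the final centers.
    d2 = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    rss = d2[np.arange(len(Y)), labels].sum()
    return labels, centers, rss

def cluster_by_regression_coefficients(Y, X, k, seed=0):
    """Simplified stand-in (NOT CORM): cluster genes on fitted coefficients.

    Y is (n_genes, n_samples); X is an (n_samples, p) design matrix.
    """
    # Ordinary least squares for all genes at once; B has shape (p, n_genes).
    B, *_ = np.linalg.lstsq(X, np.asarray(Y, dtype=float).T, rcond=None)
    return kmeans_least_squares(B.T, k, seed=seed)
```

With a design matrix holding an intercept and a time covariate, the second function groups genes by their fitted intercept and slope, whereas calling kmeans_least_squares on Y directly groups them by their observed profiles.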

Disciplines

Microarrays

Suggested Citation

Qin, Li-Xuan and Self, Steven G., "On Comparing the Clustering of Regression Models Method with K-means Clustering" (March 2007). Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series. Working Paper 14.
https://biostats.bepress.com/mskccbiostat/paper14

Supplementary Materials.pdf (573 kB)


Included in

Microarrays Commons


Collection of Biostatistics Research Archive
