Abstract

Non-negative matrix factorization (NMF) by the multiplicative updates algorithm is a powerful machine learning method for decomposing a high-dimensional nonnegative matrix V into two matrices, W and H, each with nonnegative entries, such that V ≈ WH. NMF has been shown to yield a unique, parts-based, sparse representation of the data. The nonnegativity constraints in NMF allow only additive combinations of the data, which enables it to learn parts that have distinct physical representations in reality. In the last few years, NMF has been successfully applied in a variety of areas, such as natural language processing, information retrieval, image processing, speech recognition and computational biology, for the analysis and interpretation of large-scale data.
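As a rough illustration of the multiplicative updates idea described above, the following minimal sketch factorizes a nonnegative matrix V into W and H by minimizing the generalized Kullback-Leibler divergence, which corresponds to a Poisson likelihood. It uses the classical Lee and Seung update rules, not the generalized method proposed in this paper. The function names `nmf_kl` and `kl_divergence`, the parameters `rank`, `n_iter`, `eps` and `seed`, and the random initialization are all assumptions made for the example.

```python
import numpy as np


def kl_divergence(V, WH, eps=1e-10):
    """Generalized KL divergence: sum(V * log(V / WH) - V + WH)."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))


def nmf_kl(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Illustrative NMF via classical multiplicative updates (KL objective).

    V    : (n, m) array with nonnegative entries
    rank : number of columns of W / rows of H (hypothetical parameter name)
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random start; multiplicative updates keep entries >= 0.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Update H: scale each entry by a ratio of nonnegative terms.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        # Update W using the refreshed H.
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


# Usage on simulated data with a known low-rank nonnegative structure.
rng = np.random.default_rng(42)
V = rng.random((30, 4)) @ rng.random((4, 20))
W, H = nmf_kl(V, rank=4)
print("divergence after fitting:", kl_divergence(V, W @ H))
```

Because each update multiplies the current entries by a nonnegative ratio, W and H stay nonnegative without any explicit projection step, which is the property that permits only additive combinations.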

We present a generalized approach to NMF based on Rényi's divergence between two non-negative matrices related to the Poisson likelihood. Our approach unifies various competing models and provides a unique framework for NMF. Furthermore, we generalize the equivalence between NMF and probabilistic latent semantic indexing, a well-known method used in text mining and document clustering applications. We evaluate the performance of our method in the unsupervised setting using consensus clustering and demonstrate its applicability using real-life and simulated data.

Suggested Citation

Devarajan, Karthik; Wang, Guoli; and Ebrahimi, Nader, "A unified approach to non-negative matrix factorization and probabilistic latent semantic indexing" (July 2011). COBRA Preprint Series. Working Paper 80.
https://biostats.bepress.com/cobra/art80

Included in

Bioinformatics Commons, Categorical Data Analysis Commons, Computational Biology Commons, Multivariate Analysis Commons, Statistical Methodology Commons, Statistical Models Commons, Statistical Theory Commons

Collection of Biostatistics Research Archive

COBRA Preprint Series

A unified approach to non-negative matrix factorization and probabilistic latent semantic indexing
