We introduce combinatorial mixtures, a flexible class of models for inference on mixture distributions whose components have multidimensional parameters. The key idea is to allow each element of a component-specific parameter vector to be shared by a subset of the other components. This approach allows for mixtures that range from very flexible to very parsimonious, and unifies inference on component-specific parameters with inference on the number of components. We develop Bayesian inference and computation approaches for this class of distributions. This work was originally motivated by the analysis of cancer subtypes: in terms of biological measures of interest, subtypes may be characterized by differences in location, scale, correlations, or any combination of these. We illustrate our approach using data on molecular subtypes of lung cancer.


Statistical Methodology | Statistical Theory