Multivariate Gaussian mixtures are a class of models that provide a flexible parametric approach for the representation of heterogeneous multivariate outcomes. When the outcome is a vector of repeated measurements taken on the same subject, there is often inherent dependence between observations. However, a common covariance assumption is conditional independence---that is, given the mixture component label, the outcomes for subjects are independent. In this paper, we study, through asymptotic bias calculations and simulation, the impact of covariance misspecification in multivariate Gaussian mixtures. Although maximum likelihood estimators of regression and mixing probability parameters are not consistent under misspecification, they have little asymptotic bias when mixture components are well-separated or if the assumed correlation is close to the truth even when the covariance is misspecified. We also present a robust standard error estimator and show that it outperforms conventional estimators in simulations and can indicate the model is misspecified. Body mass index data from a national longitudinal study is used to demonstrate the effects of misspecification on potential inferences made in practice.



Included in

Biostatistics Commons