The linear mixed effects model (LMM) is widely used in the analysis of clustered or longitudinal data. In practice, inference on the structure of the random effects component is of great importance, both for the proper interpretation of subject-specific effects and for valid statistical conclusions. This inference becomes considerably more challenging when the analysis involves a large number of fixed and random effects. The difficulty of variable selection arises from the need to regularize the mean model and the covariance structure simultaneously, with possible parameter constraints between the two. In this paper, we propose a novel doubly regularized restricted maximum likelihood method that selects fixed and random effects simultaneously in the LMM. The Cholesky decomposition is invoked to ensure the positive-definiteness of the selected covariance matrix of the random effects, and the selected random effects are invariant to the ordering of the predictors in the Cholesky decomposition. We then develop a new algorithm that solves the associated optimization problem effectively, with a computational cost comparable to that of the Newton-Raphson algorithm for maximum likelihood estimation (MLE) or restricted maximum likelihood (REML) estimation in the LMM. We also investigate the large-sample properties of the proposed method, including the oracle property. Both simulation studies and data analysis are included for illustration.
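To illustrate the role of the Cholesky decomposition mentioned above, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes the random-effects covariance is written as D = L L^T for a lower-triangular factor L, so that D is symmetric and positive semi-definite by construction, and it shows that setting an entire row of L to zero removes the corresponding random effect from D. The function names, the tolerance, and the numerical values are hypothetical.

```python
# Illustrative only: a covariance matrix built from a Cholesky-type factor.
# Function names and matrix values are hypothetical, not taken from the paper.

def cov_from_factor(L):
    """Return D = L L^T for a square lower-triangular factor L (list of rows)."""
    q = len(L)
    return [[sum(L[i][k] * L[j][k] for k in range(q)) for j in range(q)]
            for i in range(q)]

def selected_effects(L, tol=1e-12):
    """Indices of random effects whose row in L is not entirely zero."""
    return [i for i, row in enumerate(L) if any(abs(v) > tol for v in row)]

# Three candidate random effects; the second row has been shrunk to zero.
L = [[1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0],
     [0.5, 0.0, 2.0]]

D = cov_from_factor(L)
print(D)                    # row and column of the second effect are all zero
print(selected_effects(L))  # indices of the effects that remain in the model
```

In this sketch, a zero row in L yields a zero row and column in D, so the corresponding random effect has zero variance and zero covariance with every other effect, while the covariance of the remaining effects stays valid.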



Included in Biostatistics Commons