Published January 2006 in Computational Statistics and Data Analysis, 50(20); 475-498.


Estimators for the parameter of interest in semiparametric models often depend on a guessed model for the nuisance parameter. The choice of the model for the nuisance parameter can affect both the finite-sample bias and the efficiency of the resulting estimator of the parameter of interest. In this paper, we propose a finite-sample criterion based on cross-validation that can be used to select a nuisance parameter model from a list of candidate models. We show that the expected value of this criterion is minimized by the nuisance parameter model that yields the estimator of the parameter of interest with the smallest mean-squared error relative to the expected value of an initial consistent reference estimator. In a simulation study, we examine the performance of this criterion for selecting a model for a treatment mechanism in a marginal structural model (MSM) of point treatment data. For situations where all possible models cannot be evaluated, we outline a forward/backward model selection algorithm based on the proposed cross-validation criterion and show how it can be used to select models for multiple nuisance parameters. We evaluate the performance of this algorithm in a simulation study of the one-step estimator of the parameter of interest in an MSM where models for both a treatment mechanism and a conditional expectation of the response need to be selected. Finally, we apply the forward model selection algorithm to an MSM analysis of the relationship between boiled water use and gastrointestinal illness in HIV-positive men.
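
The selection idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the exact form of the criterion, the stabilized inverse-probability-of-treatment-weighted (IPTW) estimator, the Newton-Raphson logistic fit, the simulated data, and all function names (`fit_logistic`, `iptw_effect`, `cv_criterion`, `forward_select`) are assumptions made for illustration. The sketch assumes the criterion averages, over cross-validation splits, the squared difference between a candidate estimator computed on the training sample and a reference estimator computed on the validation sample, and it performs forward steps only.

```python
import numpy as np

def fit_logistic(X, a, n_iter=25):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        hess = X.T @ (X * (p * (1.0 - p))[:, None]) + 1e-8 * np.eye(X.shape[1])
        beta += np.linalg.solve(hess, X.T @ (a - p))
    return beta

def iptw_effect(W, a, y, cols):
    """Stabilized IPTW estimate of E[Y(1)] - E[Y(0)], with the treatment
    mechanism modeled on the covariate columns listed in `cols`."""
    X = np.column_stack([np.ones(len(a))] + [W[:, j] for j in cols])
    p = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, a)))
    p = np.clip(p, 0.01, 0.99)  # guard against extreme weights (an assumption)
    w1, w0 = a / p, (1 - a) / (1 - p)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

def cv_criterion(W, a, y, cols, ref_cols, folds):
    """Mean squared difference between the candidate estimate (training sample)
    and the reference estimate (validation sample) over all folds."""
    sq_diffs = []
    for valid in folds:
        train = np.setdiff1d(np.arange(len(a)), valid)
        psi_k = iptw_effect(W[train], a[train], y[train], cols)
        psi_ref = iptw_effect(W[valid], a[valid], y[valid], ref_cols)
        sq_diffs.append((psi_k - psi_ref) ** 2)
    return float(np.mean(sq_diffs))

def forward_select(W, a, y, ref_cols, folds):
    """Greedy forward search: add the covariate that lowers the criterion
    the most, and stop when no addition improves it."""
    selected, remaining = [], list(range(W.shape[1]))
    best = cv_criterion(W, a, y, selected, ref_cols, folds)
    while remaining:
        scores = {j: cv_criterion(W, a, y, selected + [j], ref_cols, folds)
                  for j in remaining}
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best:
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best = scores[j_best]
    return selected, best

# Hypothetical point-treatment data: only W[:, 0] confounds treatment and outcome.
rng = np.random.default_rng(0)
n = 600
W = rng.normal(size=(n, 3))
a = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.8 * W[:, 0])))
y = 1.0 * a + 1.5 * W[:, 0] + rng.normal(size=n)

folds = np.array_split(rng.permutation(n), 5)
ref_cols = [0, 1, 2]  # reference estimator uses the full treatment model
selected, best = forward_select(W, a, y, ref_cols, folds)
print("selected covariates:", selected, "criterion:", best)
```

The same folds are reused for every candidate so that the criterion values are comparable across models. If the training-sample and validation-sample estimates are independent, the expected squared difference splits into the candidate's mean-squared error around the expected reference estimate plus the variance of the reference estimate, which does not depend on the candidate; this is the sense in which minimizing the criterion targets the candidate with the smallest mean-squared error.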


Statistical Methodology | Statistical Models | Statistical Theory | Survival Analysis