Targeted maximum likelihood estimation of a parameter of a data generating distribution, known to be an element of a semiparametric model, involves constructing a parametric model through an initial density estimator, with parameter epsilon representing the amount of fluctuation of the initial density estimator, such that the score of this fluctuation model at epsilon=0 equals the efficient influence curve (canonical gradient). Many parametric fluctuation models satisfy this constraint, since it restricts only the local behavior of the model at zero fluctuation. However, it is very important that the fluctuations stay within the semiparametric model for the observed data distribution, even if the parameter can be defined on fluctuations that fall outside the assumed observed data model. In particular, with sparse data, a violation of this property can heavily affect the performance of the estimator. We demonstrate this in the context of estimating the causal effect of a binary treatment on a bounded continuous outcome. Using a fluctuation model that stays within the semiparametric model results in a targeted maximum likelihood estimator that inherently respects known bounds and is consequently more robust in sparse data situations than the targeted MLE based on a naive fluctuation model.
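The contrast between a naive fluctuation and one that respects known bounds can be illustrated with a small sketch. It is not taken from the paper and makes several assumptions: the outcome is known to lie in [a, b]; the naive fluctuation is taken to be linear, Q + epsilon*h; the bounded alternative is assumed to be a logistic fluctuation of the outcome rescaled to [0, 1]; and h stands for a hypothetical "clever covariate" whose large values mimic sparse data. All function names and toy numbers are invented for illustration.

```python
import math

def expit(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def logit(p):
    return math.log(p / (1.0 - p))

def to_unit(v, a, b, tol=1e-6):
    # Rescale from [a, b] to [0, 1], kept strictly inside (0, 1) so logit is finite.
    u = (v - a) / (b - a)
    return min(max(u, tol), 1.0 - tol)

def fluctuate_linear(q, h, eps):
    # Naive fluctuation (assumed form): nothing keeps the result inside [a, b].
    return q + eps * h

def fluctuate_logistic(q, h, eps, a, b):
    # Fluctuate on the logit scale, then map back: the result stays in (a, b).
    return a + (b - a) * expit(logit(to_unit(q, a, b)) + eps * h)

def logistic_score(eps, y, q, h, a, b):
    # Derivative in eps of the quasi-binomial log-likelihood for the rescaled outcome.
    return sum(
        hi * ((yi - a) / (b - a) - expit(logit(to_unit(qi, a, b)) + eps * hi))
        for yi, qi, hi in zip(y, q, h)
    )

def fit_epsilon_logistic(y, q, h, a, b, iters=50):
    # Newton iterations on the concave quasi-binomial log-likelihood in eps.
    eps = 0.0
    for _ in range(iters):
        p = [expit(logit(to_unit(qi, a, b)) + eps * hi) for qi, hi in zip(q, h)]
        score = logistic_score(eps, y, q, h, a, b)
        info = sum(hi * hi * pi * (1.0 - pi) for hi, pi in zip(h, p))
        if info == 0.0 or abs(score) < 1e-12:
            break
        eps += score / info
    return eps

def fit_epsilon_linear(y, q, h):
    # Least-squares fit of eps in y = q + eps * h (no intercept).
    return sum(hi * (yi - qi) for yi, qi, hi in zip(y, q, h)) / sum(hi * hi for hi in h)

# Toy data (invented): outcome bounded in [0, 10]; two large |h| values mimic sparsity.
a, b = 0.0, 10.0
y = [1.0, 9.0, 4.0, 6.0, 2.0, 10.0]       # observed outcomes
q = [2.0, 8.5, 5.0, 5.0, 3.0, 9.5]        # initial estimates of the outcome regression
h = [2.0, 20.0, -1.5, 1.2, -2.0, 25.0]    # hypothetical clever covariate

eps_lin = fit_epsilon_linear(y, q, h)
eps_log = fit_epsilon_logistic(y, q, h, a, b)
updated_linear = [fluctuate_linear(qi, hi, eps_lin) for qi, hi in zip(q, h)]
updated_logistic = [fluctuate_logistic(qi, hi, eps_log, a, b) for qi, hi in zip(q, h)]

print("linear update within bounds:  ", all(a <= v <= b for v in updated_linear))
print("logistic update within bounds:", all(a <= v <= b for v in updated_logistic))
```

In this toy example the linear update pushes the last fitted value above the upper bound, while the logistic update cannot leave (a, b) for any value of epsilon, because the fluctuation is applied on the logit scale before mapping back to the original scale.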


