We present a second-order estimator of the mean of a variable subject to missingness, under the missing at random assumption. The estimator improves upon existing methods by using an approximate second-order expansion of the parameter functional, in addition to the first-order expansion employed by standard doubly robust methods. This results in weaker assumptions about the convergence rates necessary to establish consistency, local efficiency, and asymptotic linearity. The general estimation strategy is developed under the targeted minimum loss-based estimation (TMLE) framework. We present a simulation comparing the sensitivity of the first- and second-order estimators to the convergence rate of the initial estimators of the outcome regression and missingness score. In our simulation, the second-order TMLE improved the coverage probability of a confidence interval by up to 53% for slow convergence rates of the initial estimators. In addition, we present a first-order estimator inspired by a second-order expansion of the parameter functional. This estimator requires only one-dimensional smoothing, whereas implementation of the second-order TMLE generally requires kernel smoothing on the covariate space. The proposed first-order estimator is expected to have better finite-sample performance than existing first-order estimators. In our simulations, it improved the coverage probability by up to 68% for slow convergence rates of the initial estimators of the outcome regression and the missingness score. We illustrate our methods using a publicly available dataset targeting the effect of an anticoagulant on health outcomes of patients undergoing percutaneous coronary intervention. In our example, using an estimator with expected improved properties dramatically changes the substantive conclusions of the study. We provide R code to implement the proposed estimator.
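
For orientation, the standard first-order doubly robust approach that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration in Python, not the R code that accompanies the paper and not the proposed second-order TMLE. It assumes the usual augmented inverse probability weighted (AIPW) form of a doubly robust estimator; the function name `aipw_mean`, its arguments, and the simulated data are all hypothetical, and the initial estimators are simply set to the true outcome regression and missingness score.

```python
import numpy as np


def aipw_mean(y, observed, outcome_pred, prob_observed):
    """First-order doubly robust (AIPW) estimate of E[Y] under MAR.

    y             : outcomes; entries with observed == 0 are ignored
    observed      : 1 if the outcome was observed, 0 if it is missing
    outcome_pred  : initial estimates of E[Y | observed, covariates]
    prob_observed : initial estimates of P(observed | covariates)
    """
    observed = np.asarray(observed, dtype=float)
    outcome_pred = np.asarray(outcome_pred, dtype=float)
    prob_observed = np.asarray(prob_observed, dtype=float)
    # Replace missing outcomes by 0 so NaN placeholders cannot propagate;
    # their residual term is multiplied by observed == 0 anyway.
    y = np.where(observed == 1, np.asarray(y, dtype=float), 0.0)

    # Outcome-regression prediction plus inverse-probability-weighted residual.
    phi = outcome_pred + observed / prob_observed * (y - outcome_pred)

    estimate = phi.mean()
    # Standard error from the sample variance of the per-unit contributions.
    std_error = phi.std(ddof=1) / np.sqrt(phi.size)
    return estimate, std_error


# Hypothetical simulated data: one covariate w, true mean of Y equal to 1.
rng = np.random.default_rng(0)
n = 5000
w = rng.normal(size=n)
true_prob = 1.0 / (1.0 + np.exp(-(0.5 + w)))   # missingness score
observed = rng.binomial(1, true_prob)
y_full = 1.0 + 2.0 * w + rng.normal(size=n)
y = np.where(observed == 1, y_full, np.nan)     # outcome missing when unobserved

# Assumption for illustration: the initial estimators equal the truth.
estimate, std_error = aipw_mean(y, observed, 1.0 + 2.0 * w, true_prob)
ci = (estimate - 1.96 * std_error, estimate + 1.96 * std_error)
print(f"estimate = {estimate:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

In practice the two nuisance functions would be replaced by data-adaptive estimates, and it is the convergence rate of those estimates that governs how well a Wald-type interval like the one above attains its nominal coverage.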


