"Second Order Inference for the Mean of a Variable Missing at Random" by Ivan Diaz, Marco Carone et al.

U.C. Berkeley Division of Biostatistics Working Paper Series

Title

Second Order Inference for the Mean of a Variable Missing at Random

Authors

Ivan Diaz, Johns Hopkins University
Marco Carone, University of Washington
Mark J. van der Laan, University of California, Berkeley

Abstract

We present a second order estimator of the mean of a variable subject to missingness, under the missing at random assumption. The estimator improves upon existing methods by using an approximate second order expansion of the parameter functional, in addition to the first order expansion employed by standard doubly robust methods. As a result, consistency, local efficiency, and asymptotic linearity can be established under weaker assumptions on the convergence rates of the initial estimators. The general estimation strategy is developed under the targeted minimum loss-based estimation (TMLE) framework. We present a simulation comparing the sensitivity of the first and second order estimators to the convergence rate of the initial estimators of the outcome regression and missingness score. In our simulation, the second order TMLE improved the coverage probability of a confidence interval by up to 53% for slow convergence rates of the initial estimators. In addition, we present a first order estimator inspired by a second order expansion of the parameter functional. This estimator requires only one-dimensional smoothing, whereas implementation of the second order TMLE generally requires kernel smoothing on the covariate space. The proposed first order estimator is expected to have better finite sample performance than existing first order estimators. In our simulations, it improved the coverage probability by up to 68% for slow convergence rates of the initial estimators of the outcome regression and the missingness score. We illustrate our methods using a publicly available dataset on the effect of an anticoagulant on health outcomes of patients undergoing percutaneous coronary intervention. In our example, using an estimator with expected improved properties dramatically changes the substantive conclusions of the study. We provide R code to implement the proposed estimator.
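
For orientation, the standard first order doubly robust estimator that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the textbook augmented inverse probability weighted (AIPW) form, not the second order TMLE proposed in the paper and not the authors' R code. The function names, the simulated data-generating process, and the deliberately misspecified constant outcome regression are all assumptions made for this example.

```python
import random


def aipw_mean(y, a, q_hat, g_hat):
    """First order doubly robust (AIPW) estimate of E[Y] under missing at random.

    y      : outcomes (the value is ignored wherever a[i] == 0)
    a      : 1 if the outcome was observed, 0 if it is missing
    q_hat  : estimated outcome regression E[Y | observed, W] for each unit
    g_hat  : estimated missingness score P(observed | W) for each unit
    """
    total = 0.0
    for yi, ai, qi, gi in zip(y, a, q_hat, g_hat):
        # Inverse probability weighted residual, only for observed outcomes.
        correction = (yi - qi) / gi if ai == 1 else 0.0
        total += qi + correction
    return total / len(a)


def simulate(n, seed=0):
    """Hypothetical data: W ~ Uniform(0, 1), Y = 1 + 2W + noise, so E[Y] = 2."""
    rng = random.Random(seed)
    w = [rng.random() for _ in range(n)]
    y = [1.0 + 2.0 * wi + rng.gauss(0.0, 1.0) for wi in w]
    # Outcomes with larger W are more likely to be observed (missing at random).
    g = [0.3 + 0.5 * wi for wi in w]
    a = [1 if rng.random() < gi else 0 for gi in g]
    y_obs = [yi if ai == 1 else None for yi, ai in zip(y, a)]
    return y_obs, a, g


y_obs, a, g_true = simulate(20000)

# Complete-case mean: biased here because missingness depends on W.
observed = [yi for yi, ai in zip(y_obs, a) if ai == 1]
naive = sum(observed) / len(observed)

# Misspecified outcome regression (a constant) paired with the true
# missingness score: the doubly robust estimator remains consistent.
q_bad = [naive] * len(a)
dr = aipw_mean(y_obs, a, q_bad, g_true)

print(f"complete-case mean: {naive:.3f}")
print(f"AIPW estimate:      {dr:.3f}")
```

In this sketch the estimate stays close to the true mean of 2 even though the outcome regression is wrong, because the missingness score is correct. The abstract's concern is the case where both nuisance estimators are consistent but converge slowly, which is where the second order correction is intended to help.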

Disciplines

Biostatistics

Suggested Citation

Diaz, Ivan; Carone, Marco; and van der Laan, Mark J., "Second Order Inference for the Mean of a Variable Missing at Random" (May 2015). U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 337.
https://biostats.bepress.com/ucbbiostat/paper337

Included in

Biostatistics Commons

Collection of Biostatistics Research Archive
