Estimation with Interval Censored Data and Covariates


Published in Lifetime Data Analysis 3: 77-91, 1997.


In biostatistical applications, interest often focuses on estimating the distribution of the time T between two consecutive events. If the initial event time is observed and the subsequent event time is only known to be larger or smaller than an observed monitoring time C, then the data conform to the well-understood singly-censored current status model, also known as interval censored data, case I. Additional covariates can be used to allow for dependent censoring and to improve estimation of the marginal distribution of T. Assuming a wrong model for the conditional distribution of T, given the covariates, will lead to an inconsistent estimator of the marginal distribution. On the other hand, the nonparametric maximum likelihood estimator (NPMLE) of the distribution function F_T requires splitting the sample into several subsamples, each corresponding to a particular value of the covariates, computing the NPMLE for every subsample, and then taking an average. With a few continuous covariates the performance of the resulting estimator is typically miserable. In van der Laan and Robins (1996), a locally efficient one-step estimator is proposed for smooth functionals of the distribution of T, assuming nothing about the conditional distribution of T, given the covariates, but assuming a model for censoring, given the covariates. The estimators are asymptotically linear if the censoring mechanism is estimated correctly. The estimator also uses an estimator of the conditional distribution of T, given the covariates. If this estimate is consistent, then the estimator is efficient; if it is inconsistent, then the estimator is still consistent and asymptotically normal. In this paper we show that the estimators can also be used to estimate the distribution function in a locally optimal way. Moreover, we show that the proposed estimator can be used to estimate the distribution based on interval censored data (T is now known to lie between two observed points) in the presence of covariates.
The resulting estimator also has a known influence curve, so that asymptotic confidence intervals are directly available. In particular, one can apply our proposal to interval censored data without covariates. Geskus (1992) computed the information bound for interval censored data with two uniformly distributed monitoring times when T is uniformly distributed. We show that the relative efficiency of our proposal with respect to this optimal bound equals 0.994, which is also reflected in finite sample simulations. Finally, the good practical performance of the estimator is shown in a simulation study.
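
As background for the NPMLE mentioned in the abstract: for current status data without covariates, the NPMLE of the distribution function of T at the observed monitoring times is known to coincide with the isotonic regression of the indicators 1(T <= C) ordered by C. The Python sketch below illustrates only that building block using the pool-adjacent-violators algorithm. It is not the one-step estimator proposed in the paper, and the function name, the uniform simulation setup, and the sample size are assumptions chosen for illustration.

```python
import numpy as np


def current_status_npmle(c, delta):
    """Sketch of the NPMLE of F(t) = P(T <= t) from current status data.

    c     : monitoring times C_i
    delta : indicators, 1 if T_i <= C_i and 0 otherwise
    Returns the sorted monitoring times and the fitted F at those times.
    (Hypothetical helper; no covariates are used here.)
    """
    c = np.asarray(c, dtype=float)
    delta = np.asarray(delta, dtype=float)
    # Sort by monitoring time; within ties put the 1s first so that
    # tied monitoring times are pooled to a common fitted value.
    order = np.lexsort((-delta, c))
    c_sorted, y = c[order], delta[order]

    # Pool-adjacent-violators: keep blocks as (sum, count) and merge
    # neighbouring blocks while their means are decreasing.
    sums, counts = [], []
    for value in y:
        sums.append(value)
        counts.append(1)
        while len(sums) > 1 and sums[-2] / counts[-2] > sums[-1] / counts[-1]:
            s, n = sums.pop(), counts.pop()
            sums[-1] += s
            counts[-1] += n

    fitted = np.concatenate([np.full(n, s / n) for s, n in zip(sums, counts)])
    return c_sorted, fitted


if __name__ == "__main__":
    # Assumed toy setup: T and C independent Uniform(0, 1).
    rng = np.random.default_rng(0)
    t = rng.uniform(size=500)
    c = rng.uniform(size=500)
    times, f_hat = current_status_npmle(c, (t <= c).astype(int))
    # Compare the fit with the true F(t) = t at the monitoring times.
    print(np.max(np.abs(f_hat - times)))
```

Applying such a fit separately within every covariate stratum and averaging is the subsample approach the abstract describes as performing poorly when covariates are continuous.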


Statistical Methodology | Statistical Theory
