U.C. Berkeley Division of Biostatistics Working Paper Series

Evaluation of Progress Towards the UNAIDS 90-90-90 HIV Care Cascade: A Description of Statistical Methods Used in an Interim Analysis of the Intervention Communities in the SEARCH Study

Laura Balzer et al. — Tue, 21 Feb 2017 12:34:58 PST

WHO guidelines call for universal antiretroviral treatment, and UNAIDS has set a global target to virally suppress most HIV-positive individuals. Accurate estimates of population-level coverage at each step of the HIV care cascade (testing, treatment, and viral suppression) are needed to assess the effectiveness of "test and treat" strategies implemented to achieve this goal. The data available to inform such estimates, however, are susceptible to informative missingness: the number of HIV-positive individuals in a population is unknown; individuals tested for HIV may not be representative of those whom a testing intervention fails to reach; and HIV-positive individuals with a viral load measured may not be representative of those for whom no viral load is obtained. We provide an in-depth description of the statistical methods (target parameters, assumptions, statistical estimands, and algorithms) used in an interim analysis of the intervention arm of the SEARCH Study (NCT01864603) to analyze progress towards the UNAIDS 90-90-90 target at study baseline and after one and two years. We describe the methods used to account for informative measurement in all analyses as well as for informative censoring in longitudinal analyses. We use targeted maximum likelihood estimation (TMLE) with Super Learning to generate semi-parametric efficient and doubly robust estimates of the care cascade among an open cohort of prevalent HIV-positive adults and among a closed cohort of baseline HIV-positive adults. TMLE is also used to evaluate predictors of poor outcomes.

Doubly-robust Nonparametric Inference on the Average Treatment Effect

David Benkeser et al. — Tue, 18 Oct 2016 16:21:35 PDT

Doubly-robust estimators are widely used to draw inference about the average effect of a treatment. Such estimators are consistent for the effect of interest if either one of two nuisance parameters is consistently estimated. However, if flexible, data-adaptive estimators of these nuisance parameters are used, double-robustness does not readily extend to inference. We present a general theoretical study of the behavior of doubly-robust estimators of an average treatment effect when one of the nuisance parameters is inconsistently estimated. We contrast different approaches for constructing such estimators and investigate the extent to which they may be modified to also allow doubly-robust inference. We find that while targeted maximum likelihood estimation can be used to solve this problem very naturally, common alternative frameworks appear to be inappropriate for this purpose. We provide a theoretical study and a numerical evaluation of the alternatives considered. Our simulations highlight the need and usefulness of these approaches in practice, while our theoretical developments have broad implications for the construction of estimators that permit doubly-robust inference in other problems.
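
The consistency property described above can be illustrated with the augmented inverse-probability-weighted (AIPW) estimator, one standard doubly-robust estimator. The following is a minimal sketch on simulated data, not the estimators proposed in the paper: the data-generating process, the function names, and the deliberately misspecified nuisance functions are all assumptions made for illustration. It shows only that the point estimate stays close to the truth when either nuisance is correct; it does not address the inference problem the paper studies.

```python
import random


def aipw_ate(data, outcome_reg, propensity):
    """AIPW estimate of the average treatment effect E[Y(1) - Y(0)].

    data: list of (w, a, y) tuples with binary treatment a.
    outcome_reg(a, w): estimate of E[Y | A=a, W=w].
    propensity(w): estimate of P(A=1 | W=w).
    """
    total = 0.0
    for w, a, y in data:
        g = propensity(w)
        # Plug-in contrast of the outcome regression ...
        plug_in = outcome_reg(1, w) - outcome_reg(0, w)
        # ... plus an inverse-probability-weighted residual correction.
        correction = (a / g - (1 - a) / (1 - g)) * (y - outcome_reg(a, w))
        total += plug_in + correction
    return total / len(data)


def simulate(n, seed=0):
    """Hypothetical data: binary covariate W, confounded treatment A, true ATE = 1."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        w = rng.randint(0, 1)
        a = 1 if rng.random() < 0.3 + 0.4 * w else 0
        y = 1.0 * a + 2.0 * w + rng.gauss(0, 1)
        data.append((w, a, y))
    return data


def true_g(w):
    return 0.3 + 0.4 * w


def true_q(a, w):
    return 1.0 * a + 2.0 * w


def wrong_g(w):
    return 0.5  # ignores the dependence of treatment on W


def wrong_q(a, w):
    return 0.0  # ignores both treatment and covariate


data = simulate(20000)
est_wrong_q = aipw_ate(data, wrong_q, true_g)  # only the propensity is correct
est_wrong_g = aipw_ate(data, true_q, wrong_g)  # only the outcome regression is correct
```

In this simulation both estimates should land near the true effect of 1, even though each uses one badly misspecified nuisance function.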

Online Cross-Validation-Based Ensemble Learning

David Benkeser et al. — Tue, 18 Oct 2016 16:11:35 PDT

Online estimators update a current estimate with a new incoming batch of data without having to revisit past data, thereby providing streaming estimates that are scalable to big data. We develop flexible, ensemble-based online estimators of an infinite-dimensional target parameter, such as a regression function, in the setting where data are generated sequentially by a common conditional data distribution given summary measures of the past. This setting encompasses a wide range of time-series models and, as a special case, models for independent and identically distributed data. Our estimator considers a large library of candidate online estimators and uses online cross-validation to identify the algorithm with the best performance. We show that by basing estimates on the cross-validation-selected algorithm, we are asymptotically guaranteed to perform as well as the true, unknown best-performing algorithm. We provide extensions of this approach including online estimation of the optimal ensemble of candidate online estimators. We illustrate the practical performance of our methods using simulations and a real data example where we make streaming predictions of infectious disease incidence using data from a large database.
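
The idea of online cross-validation can be sketched as follows: each incoming observation is first used to score every candidate's current prediction, and only then used to update the candidates, so no past data is revisited. This is a minimal sketch under simplifying assumptions (batches of size one, squared-error loss, two toy candidates); the class and function names are hypothetical and do not come from the paper.

```python
import random


class RunningMean:
    """Candidate 1: predicts the running mean of the outcomes seen so far."""

    def __init__(self):
        self.n, self.mean = 0, 0.0

    def predict(self, x):
        return self.mean

    def update(self, x, y):
        self.n += 1
        self.mean += (y - self.mean) / self.n


class SGDLinear:
    """Candidate 2: linear regression fitted by stochastic gradient descent."""

    def __init__(self, lr=0.05):
        self.a, self.b, self.lr = 0.0, 0.0, lr

    def predict(self, x):
        return self.a + self.b * x

    def update(self, x, y):
        err = y - self.predict(x)
        self.a += self.lr * err
        self.b += self.lr * err * x


def online_cv_select(stream, learners):
    """Return the candidate with the smallest cumulative online CV loss."""
    losses = {name: 0.0 for name in learners}
    for x, y in stream:
        for name, learner in learners.items():
            # Score on the new point BEFORE the learner has seen it ...
            losses[name] += (y - learner.predict(x)) ** 2
            # ... then let the learner update; the point is never revisited.
            learner.update(x, y)
    best = min(losses, key=losses.get)
    return best, losses


rng = random.Random(1)
stream = []
for _ in range(5000):
    x = rng.uniform(-1, 1)
    stream.append((x, 2.0 * x + rng.gauss(0, 0.5)))

best, losses = online_cv_select(
    stream, {"mean": RunningMean(), "sgd_linear": SGDLinear()}
)
```

Because the simulated outcome depends linearly on `x`, the selector should prefer `sgd_linear` here. Memory use is constant in the length of the stream, since only the learners' states and the running losses are kept.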

Performance-constrained Binary Classification Using Ensemble Learning: an Application to Cost-efficient Targeted PrEP Strategies

Wenjing Zheng et al. — Tue, 04 Oct 2016 14:44:06 PDT

Binary classification problems are ubiquitous in health and social science applications. In many cases, one wishes to balance two conflicting criteria for an optimal binary classifier. For instance, in resource-limited settings, an HIV prevention program based on offering Pre-Exposure Prophylaxis (PrEP) to select high-risk individuals must balance the sensitivity of the binary classifier in detecting future seroconverters (and hence offering them PrEP regimens) with the total number of PrEP regimens that is financially and logistically feasible for the program to deliver. In this article, we consider a general class of performance-constrained binary classification problems wherein the objective function and the constraint are both monotonic with respect to a threshold function. These include the minimization of the Rate of Positive Predictions subject to a lower bound on the sensitivity, and vice versa, and the Neyman-Pearson paradigm, which minimizes the type II error subject to an upper bound on the type I error. We propose an ensemble approach to these binary classification problems based on the Super Learner algorithm, characterized by weights combining the constituent risk prediction algorithms and a discriminating risk threshold for classification that aim to minimize the given constrained optimality criterion. We then illustrate the application of the proposed classifier to develop an individual PrEP targeting strategy in a resource-limited setting, with the goal of minimizing the number of PrEP offerings while achieving a minimum required sensitivity. This proof-of-concept data analysis uses baseline data from the ongoing Sustainable East Africa Research in Community Health study.
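
The threshold step alone can be sketched as follows. Given risk scores that have already been computed, both the sensitivity and the rate of positive predictions decrease as the threshold rises, so the largest threshold that still meets the sensitivity bound minimizes the number of people flagged. This is a minimal sketch, not the ensemble procedure of the paper: the function name and the toy data are hypothetical, and the estimation of Super Learner weights is omitted.

```python
def min_positive_rate_threshold(scores, labels, min_sensitivity):
    """Largest threshold c such that the rule "score >= c is positive"
    keeps sensitivity >= min_sensitivity on the given data.

    Returns (threshold, rate of positive predictions at that threshold).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    best = None
    # Scan candidate thresholds from strictest to most lenient; the first one
    # that satisfies the constraint flags the fewest individuals.
    for c in sorted(set(scores), reverse=True):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= c and y == 1)
        if true_pos / n_pos >= min_sensitivity:
            best = c
            break
    flagged = sum(1 for s in scores if s >= best)
    return best, flagged / len(scores)


# Toy data: ten individuals, four of whom are future cases (label 1).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

threshold, positive_rate = min_positive_rate_threshold(scores, labels, 0.75)
```

With a required sensitivity of 0.75, three of the four cases must be flagged; the rule `score >= 0.6` does so while offering the intervention to 4 of the 10 individuals.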

Practical Targeted Learning from Large Data Sets by Survey Sampling

Patrice Bertail et al. — Thu, 07 Jul 2016 16:08:33 PDT

We address the practical construction of asymptotic confidence intervals for smooth (i.e., pathwise differentiable), real-valued statistical parameters by targeted learning from independent and identically distributed data in contexts where sample size is so large that it poses computational challenges. We observe some summary measure of all data and select a sub-sample from the complete data set by Poisson rejective sampling with unequal inclusion probabilities based on the summary measures. Targeted learning is carried out from the easier to handle sub-sample. We derive a central limit theorem for the targeted minimum loss estimator (TMLE) which enables the construction of the confidence intervals. The inclusion probabilities can be optimized to reduce the asymptotic variance of the TMLE. We illustrate the procedure with two examples where the parameters of interest are variable importance measures of an exposure (binary or continuous) on an outcome. We also conduct a simulation study and comment on its results.
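
The sub-sampling idea can be sketched as follows. Note the simplifications, which are assumptions of this sketch rather than the paper's procedure: it uses plain Poisson sampling (independent inclusion, random sample size) instead of Poisson rejective sampling, and it estimates a simple mean with inverse-probability weights instead of running a TMLE. The function names and inclusion probabilities are hypothetical.

```python
import random


def poisson_subsample(values, summaries, incl_prob, rng):
    """Include each unit independently with probability incl_prob(summary).

    Returns (value, weight) pairs, where weight = 1 / inclusion probability.
    """
    sample = []
    for v, s in zip(values, summaries):
        p = incl_prob(s)
        if rng.random() < p:
            sample.append((v, 1.0 / p))
    return sample


def weighted_mean(sample):
    """Inverse-probability-weighted mean computed from the sub-sample only."""
    total_weight = sum(w for _, w in sample)
    return sum(v * w for v, w in sample) / total_weight


rng = random.Random(2)
N = 100000
# A cheap binary summary measure is observed for every unit.
summaries = [rng.randint(0, 1) for _ in range(N)]
values = [rng.gauss(5.0 if s else 1.0, 1.0) for s in summaries]
full_mean = sum(values) / N


def incl_prob(s):
    # Unequal inclusion probabilities driven by the summary measure.
    return 0.10 if s else 0.02


sample = poisson_subsample(values, summaries, incl_prob, rng)
est = weighted_mean(sample)
```

Here the sub-sample holds roughly 6% of the units, yet the weighted mean should track the full-data mean closely because each sampled unit is reweighted by the inverse of its inclusion probability.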

Scalable Collaborative Targeted Learning for High-dimensional Data

Cheng Ju et al. — Wed, 29 Jun 2016 11:04:20 PDT

Robust inference of a low-dimensional parameter in a large semi-parametric model relies on external estimators of infinite-dimensional features of the distribution of the data. Typically, only one of the latter is optimized for the sake of constructing a well behaved estimator of the low-dimensional parameter of interest. Optimizing more than one of them for the sake of achieving a better bias-variance trade-off in the estimation of the parameter of interest is the core idea driving the C-TMLE procedure.

The original C-TMLE procedure can be presented as a greedy forward stepwise algorithm. It does not scale well when the number $p$ of covariates increases drastically. This motivates the introduction of a novel template of C-TMLE procedure where the covariates are pre-ordered. Its time complexity is $\mathcal{O}(p)$ as opposed to the original $\mathcal{O}(p^2)$, a remarkable gain. We propose two pre-ordering strategies and suggest a rule of thumb to develop other meaningful strategies. Because it is usually unclear a priori which pre-ordering strategy to choose, we also introduce a SL-C-TMLE procedure that enables the data-driven choice of the better pre-ordering strategy given the problem at hand. Its time complexity is $\mathcal{O}(p)$ as well.
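
To make the complexity comparison concrete, here is a rough count of candidate fits. It assumes, as a simplification not spelled out in the abstract, that the greedy procedure evaluates every remaining covariate at each of its $p$ steps, whereas the pre-ordered template evaluates only the next covariate in the ordering:

$$
\underbrace{\sum_{k=1}^{p} (p - k + 1)}_{\text{greedy}} = \frac{p(p+1)}{2} = \mathcal{O}(p^2),
\qquad
\underbrace{\sum_{k=1}^{p} 1}_{\text{pre-ordered}} = p = \mathcal{O}(p).
$$

For $p = 1000$, that is 500,500 candidate fits versus 1,000.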

A Julia software package makes it easy to implement our variants of C-TMLE procedures. We use the software to assess their computational burdens in different scenarios; to compare their performances in simulation studies involving fully synthetic data or partially synthetic data based on a real, large electronic health database; and to showcase their application to the analyses of three real, large electronic health databases. In all analyses involving electronic health databases, the vanilla C-TMLE procedure is unacceptably slow. Judging from the simulation studies, our pre-ordering strategies work well, and so does the SL-C-TMLE procedure.

Propensity score prediction for electronic healthcare databases using Super Learner and High-dimensional Propensity Score Methods

Cheng Ju et al. — Tue, 28 Jun 2016 16:12:13 PDT

The optimal learner for prediction modeling varies depending on the underlying data-generating distribution. Super Learner (SL) is a generic ensemble learning algorithm that uses cross-validation to select among a "library" of candidate prediction models. The SL is not restricted to a single prediction model, but uses the strengths of a variety of learning algorithms to adapt to different databases. While the SL has been shown to perform well in a number of settings, it has not been thoroughly evaluated in large electronic healthcare databases that are common in pharmacoepidemiology and comparative effectiveness research. In this study, we applied and evaluated the performance of the SL in its ability to predict treatment assignment using three electronic healthcare databases. We considered a library of algorithms that consisted of both nonparametric and parametric models. We also considered a novel strategy for prediction modeling that combines the SL with the high-dimensional propensity score (hdPS) variable selection algorithm. Predictive performance was assessed using three metrics: the negative log-likelihood, area under the curve (AUC), and time complexity. Results showed that the best individual algorithm, in terms of predictive performance, varied across datasets. The SL was able to adapt to the given dataset and optimize predictive performance relative to any individual learner. Combining the SL with the hdPS was the most consistent prediction method and may be promising for PS estimation and prediction modeling in electronic healthcare databases.
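
The cross-validated selection step can be sketched as a "discrete" Super Learner, which picks the single candidate with the lowest cross-validated risk. This is a minimal sketch with a two-algorithm toy library and squared-error loss; the function names and simulated data are hypothetical, and the full SL, which combines candidates with estimated weights, and the hdPS variable selection step are both omitted.

```python
import random


def fit_mean(xs, ys):
    """Candidate 1: ignore x and predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_linear(xs, ys):
    """Candidate 2: simple least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def cv_risk(fit, xs, ys, k=5):
    """K-fold cross-validated mean squared error of one candidate."""
    n = len(xs)
    sq_err = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        valid = [i for i in range(n) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        sq_err += sum((ys[i] - model(xs[i])) ** 2 for i in valid)
    return sq_err / n


def discrete_super_learner(library, xs, ys, k=5):
    """Select the candidate with the lowest CV risk, then refit it on all data."""
    risks = {name: cv_risk(fit, xs, ys, k) for name, fit in library.items()}
    best = min(risks, key=risks.get)
    return best, library[best](xs, ys), risks


rng = random.Random(3)
xs = [rng.uniform(0, 1) for _ in range(500)]
ys = [3.0 * x + rng.gauss(0, 0.3) for x in xs]

best, model, risks = discrete_super_learner(
    {"mean": fit_mean, "linear": fit_linear}, xs, ys
)
```

Because risk is always measured on held-out folds, a flexible candidate is chosen only if it predicts unseen data better, which is what lets the selector adapt to each database.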

TMLE for Marginal Structural Models Based on an Instrument

Boriska Toth et al. — Fri, 24 Jun 2016 09:42:20 PDT

We consider estimation of a causal effect of a possibly continuous treatment when treatment assignment is potentially subject to unmeasured confounding, but an instrumental variable is available. Our focus is on estimating heterogeneous treatment effects, so that the treatment effect can be a function of an arbitrary subset of the observed covariates. One setting where this framework is especially useful is with clinical outcomes. Allowing the causal dose-response curve to depend on a subset of the covariates, we define our parameter of interest to be the projection of the true dose-response curve onto a user-supplied working marginal structural model. We develop a targeted minimum loss-based estimator (TMLE) of this estimand. Our TMLE can be viewed as a generalization of the two-stage regression method in the instrumental variable methodology to a semiparametric model with minimal assumptions. The asymptotic efficiency and robustness of this substitution estimator are outlined. Through detailed simulations, we demonstrate that our estimator's finite-sample performance can beat other semiparametric estimators with similar asymptotic properties. In addition, our estimator can greatly outperform standard approaches. For instance, the use of data-adaptive learning to achieve a good fit can lead to both lower bias and lower variance than for an incorrectly specified parametric estimator. Finally, we apply our estimator to a real dataset to estimate the effect of parents' education on their infant's health.

Data-adaptive Inference of the Optimal Treatment Rule and its Mean Reward. The Masked Bandit

Antoine Chambaz et al. — Tue, 12 Apr 2016 13:45:08 PDT

This article studies the data-adaptive inference of an optimal treatment rule. A treatment rule is an individualized treatment strategy in which treatment assignment for a patient is based on her measured baseline covariates. Eventually, a reward is measured on the patient. We also infer the mean reward under the optimal treatment rule. We do so in the so-called non-exceptional case, i.e., assuming that there is no stratum of the baseline covariates where treatment is neither beneficial nor harmful, and under a companion margin assumption.

Our pivotal estimator, whose definition hinges on the targeted minimum loss estimation (TMLE) principle, actually infers the mean reward under the current estimate of the optimal treatment rule. This data-adaptive statistical parameter is worthy of interest on its own. Our main result is a central limit theorem which enables the construction of confidence intervals on both mean rewards under the current estimate of the optimal treatment rule and under the optimal treatment rule itself. The asymptotic variance of the estimator takes the form of the variance of an efficient influence curve at a limiting distribution, allowing us to discuss the efficiency of inference.

As a by-product, we also derive confidence intervals on two cumulated pseudo-regrets, a key notion in the study of bandit problems. Seen as two additional data-adaptive statistical parameters, they compare the sum of the rewards actually received during the course of the experiment with either the sum of the means of the rewards or the counterfactual rewards we would have obtained if we had used from the start the current estimate of the optimal treatment rule to assign treatment.

A simulation study illustrates the procedure. One of the cornerstones of the theoretical study is a new maximal inequality for martingales with respect to the uniform entropy integral.

Marginal Structural Models with Counterfactual Effect Modifiers

Wenjing Zheng et al. — Wed, 30 Mar 2016 12:48:57 PDT

In health and social sciences, research questions often involve systematic assessment of the modification of treatment causal effect by patient characteristics, in longitudinal settings with time-varying or post-intervention effect modifiers of interest. In this work, we investigate the robust and efficient estimation of the so-called Counterfactual-History-Adjusted Marginal Structural Model (van der Laan and Petersen (2007)), which models the conditional intervention-specific mean outcome given modifier history in an ideal experiment where, possibly contrary to fact, the subject was assigned the intervention of interest, including the treatment sequence in the conditioning history. We establish the semiparametric efficiency theory for these models, and present a substitution-based, semiparametric efficient and doubly robust estimator using the targeted maximum likelihood estimation methodology (TMLE, e.g. van der Laan and Rubin (2006), van der Laan and Rose (2011)). To facilitate implementation in applications where the effect modifier is high dimensional, our third contribution is a projected influence curve (and the corresponding TMLE estimator), which retains most of the robustness of its efficient peer and can be easily implemented in applications where the use of the efficient influence curve becomes taxing. In addition to these two robust estimators, we also present an Inverse-Probability-Weighted (IPW) estimator (e.g. Robins (1997a), Hernan, Brumback, and Robins (2000)), and a non-targeted G-computation estimator (Robins (1986)). The comparative performance of these estimators is assessed in a simulation study. The use of the TMLE estimator (based on the projected influence curve) is illustrated in a secondary data analysis for the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) trial.

One-Step Targeted Minimum Loss-based Estimation Based on Universal Least Favorable One-Dimensional Submodels

Mark J. van der Laan et al. — Wed, 30 Mar 2016 12:48:41 PDT

Evaluating the Impact of Treating the Optimal Subgroup

Alexander R. Luedtke et al. — Tue, 22 Mar 2016 12:34:09 PDT

Suppose we have a binary treatment used to influence an outcome. Given data from an observational or controlled study, we wish to determine whether or not there exists some subset of observed covariates in which the treatment is more effective than the standard practice of no treatment. Furthermore, we wish to quantify the improvement in population mean outcome that will be seen if this subgroup receives treatment and the rest of the population remains untreated. We show that this problem is surprisingly challenging given how often it is an (at least implicit) study objective. Blindly applying standard techniques fails to yield any apparent asymptotic results, while using existing techniques to confront the non-regularity does not necessarily help at distributions where there is no treatment effect. Here we describe an approach to estimate the impact of treating the subgroup which benefits from treatment that is valid in a nonparametric model and is able to deal with the case where there is no treatment effect. The approach is a slight modification of an approach that recently appeared in the individualized medicine literature.

Evaluating the Impact of a HIV Low-Risk Express Care Task-Shifting Program: A Case Study of the Targeted Learning Roadmap

Linh Tran et al. — Wed, 02 Mar 2016 15:32:12 PST

In conducting studies on an exposure of interest, a systematic roadmap should be applied for translating causal questions into statistical analyses and interpreting the results. In this paper we describe an application of one such roadmap applied to estimating the joint effect of both time to availability of a nurse-based triage system (low risk express care (LREC)) and individual enrollment in the program among HIV patients in East Africa. Our study population is comprised of 16,513 subjects found eligible for this task-shifting program within 15 clinics in Kenya between 2006 and 2009, with each clinic starting the LREC program between 2007 and 2008. After discretizing follow-up into 90-day time intervals, we targeted the population mean counterfactual outcome (i.e. counterfactual probability of either dying or being lost to follow-up) at up to 450 days after initial LREC eligibility under three fixed treatment interventions. These were (i) under no program availability during the entire follow-up, (ii) under immediate program availability at initial eligibility, but non-enrollment during the entire follow-up, and (iii) under immediate program availability and enrollment at initial eligibility. We further estimated the controlled direct effect of immediate program availability compared to no program availability, under a hypothetical intervention to prevent individual enrollment in the program. Targeted minimum loss-based estimation was used to estimate the mean outcome, while Super Learning was implemented to estimate the required nuisance parameters. Analyses were conducted with the ltmle R package; analysis code is available at an online repository as an R package. Results showed that at 450 days, the probability of in-care survival was 0.93 (95% CI: 0.91, 0.95) for subjects with immediate availability and enrollment, and 0.87 (95% CI: 0.86, 0.87) for subjects with immediate availability who never enrolled. For subjects without LREC availability, it was 0.91 (95% CI: 0.90, 0.92). Immediate program availability without individual enrollment, compared to no program availability, was estimated to slightly albeit significantly decrease survival by 4% (95% CI: 0.03, 0.06; p < 0.01). Immediate availability and enrollment resulted in a 7% higher in-care survival compared to immediate availability with non-enrollment after 450 days (95% CI: -0.08, -0.05; p < 0.01). The results are consistent with a fairly small impact of both availability and enrollment in the LREC program on in-care survival.

Semi-Parametric Estimation and Inference for the Mean Outcome of the Single Time-Point Intervention in a Causally Connected Population

Oleg Sofrygin et al. — Mon, 04 Jan 2016 11:39:02 PST

We study the framework for semi-parametric estimation and statistical inference for the sample average treatment-specific mean effects in observational settings where data are collected on a single network of connected units (e.g., in the presence of interference or spillover). Despite recent advances, many of the current statistical methods rely on estimation techniques that assume a particular parametric model for the outcome, even though some of the most important statistical assumptions required by these models are most likely violated in the observational network settings, often resulting in invalid and anti-conservative statistical inference. In this manuscript, we rely on the recent methodological advances for the targeted maximum likelihood estimation (TMLE) of causal effects in a network of causally connected units, to describe an estimation approach that permits more realistic classes of data-generative models and provides valid statistical inference in the context of network-dependent data. The approach is applied to an observational setting with a single time point stochastic intervention. We start by assuming that the true observed data-generating distribution belongs to a large class of semi-parametric statistical models. We then impose some restrictions on the possible set of the data-generative distributions that may belong to our statistical model. For example, we assume that the dependence among units can be fully described by the known network, and that the dependence on other units can be summarized via some known (but otherwise arbitrary) summary measures. We show that under our modeling assumptions, our estimand is equivalent to an estimand in a hypothetical iid data distribution, where the latter distribution is a function of the observed network data-generating distribution.
With this key insight in mind, we show that the TMLE for our estimand in dependent network data can be described as a certain iid data TMLE algorithm, also resulting in a new simplified approach to conducting statistical inference. We demonstrate the validity of our approach in a network simulation study. We also extend prior work on dependent-data TMLE towards estimation of novel causal parameters, e.g., the unit-specific direct treatment effects under interference and the effects of interventions that modify the initial network structure.

A Generally Efficient Targeted Minimum Loss Based Estimator

Mark J. van der Laan — Mon, 14 Dec 2015 09:45:34 PST

Suppose we observe n independent and identically distributed observations of a finite dimensional bounded random variable. This article is concerned with the construction of an efficient targeted minimum loss-based estimator (TMLE) of a pathwise differentiable target parameter based on a realistic statistical model.

The canonical gradient of the target parameter at a particular data distribution will depend on the data distribution through an infinite dimensional nuisance parameter which can be defined as the minimizer of the expectation of a loss function (e.g., log-likelihood loss). For many models and target parameters the nuisance parameter can be split up in two components, one required for evaluation of the target parameter and one real nuisance parameter. The only smoothness condition we will enforce on the statistical model is that these nuisance parameters are multivariate real valued cadlag functions and have a finite supremum and variation norm.

We propose a general one-step targeted minimum loss-based estimator (TMLE) based on an initial estimator of the nuisance parameters defined by a loss-based super-learner that uses cross-validation to combine a library of candidate estimators. We enforce this library to contain minimum loss based estimators minimizing the empirical risk over the parameter space under the additional constraint that the variation norm is bounded by a set constant, across a set of constants for which the maximal constant converges to infinity with sample size. We show that this super-learner is not only asymptotically equivalent with the best performing algorithm in the library, but also that it always converges to the true nuisance parameter values at a rate faster than $n^{-1/4}$. This minimal rate applies to each dimension of the data and even to nonparametric statistical models. We also demonstrate that the implementation of these constant-specific minimum loss-based estimators can be carried out by minimizing the empirical risk over linear combinations of basis functions under the constraint that the sum of the absolute value of the coefficients is smaller than the constant (e.g., Lasso regression), making our proposed estimators practically feasible.

Based on this rate of the super-learner of the nuisance parameter, we can establish that this one-step TMLE is asymptotically efficient at any data generating distribution in the model, under very weak structural conditions on the target parameter mapping and model. We demonstrate our general theorems by constructing such a one-step TMLE of the average causal effect in a nonparametric model, and presenting the corresponding efficiency theorem.

An Omnibus Nonparametric Test of Equality in Distribution for Unknown Functions

Alexander R. Luedtke et al. — Fri, 16 Oct 2015 10:15:49 PDT

We present a novel family of nonparametric omnibus tests of the hypothesis that two unknown but estimable functions are equal in distribution when applied to the observed data structure. We developed these tests, which represent a generalization of the maximum mean discrepancy tests described in Gretton et al. [2006], using recent developments from the higher-order pathwise differentiability literature. Despite their complex derivation, the associated test statistics can be expressed rather simply as U-statistics. We study the asymptotic behavior of the proposed tests under the null hypothesis and under both fixed and local alternatives. We provide examples to which our tests can be applied and show that they perform well in a simulation study. As an important special case, our proposed tests can be used to determine whether an unknown function, such as the conditional average treatment effect, is equal to zero almost surely.

The Statistics of Sensitivity Analyses

Alexander R. Luedtke et al. — Tue, 06 Oct 2015 15:29:44 PDT

Suppose one wishes to estimate a causal parameter given a sample of observations. This requires making unidentifiable assumptions about an underlying causal mechanism. Sensitivity analyses help investigators understand what impact violations of these assumptions could have on the causal conclusions drawn from a study, though they themselves rely on untestable (but hopefully more interpretable) assumptions. Díaz and van der Laan (2013) advocate the use of a sequence (or continuum) of interpretable untestable assumptions of increasing plausibility for the sensitivity analysis so that experts can have informed opinions about which are true. In this work, we argue that using appropriate statistical procedures when conducting a sensitivity analysis is crucial to drawing valid conclusions about a causal question and understanding what assumptions one would need to make to do so. Conducting a sensitivity analysis typically relies on estimating features of the unknown observed data distribution, and thus naturally leads to statistical problems about which optimality results are already known. We present a general template for efficiently estimating the bounds on the causal parameter resulting from a given untestable assumption. The sequence of assumptions yields a sequence of confidence intervals which, given a suitable statistical procedure, attain proper coverage for the causal parameter if the corresponding assumption is true. We illustrate the pitfalls of an inappropriate statistical procedure with a toy example, and apply our approach to data from the Western Collaborative Group Study to show its utility in practice.

Computerizing Efficient Estimation of a Pathwise Differentiable Target Parameter

Mark J. van der Laan et al. — Mon, 27 Jul 2015 12:08:52 PDT

Frangakis et al. (2015) proposed a numerical method for computing the efficient influence function of a parameter in a nonparametric model at a specified distribution and observation (provided such an influence function exists). Their approach is based on the assumption that the efficient influence function is given by the directional derivative of the target parameter mapping in the direction of a perturbation of the data distribution defined as the convex line from the data distribution to a pointmass at the observation. In our discussion paper, Luedtke et al. (2015), we propose a regularization of this procedure and establish the validity of this method in great generality. In this article we propose a generalization of the latter regularized numerical delta method for computing the efficient influence function for general statistical models, and formally establish its validity under appropriate regularity conditions. Our proposed method consists of applying the regularized numerical delta method for nonparametrically-defined target parameters proposed in Luedtke et al. (2015) to the nonparametrically-defined maximum likelihood mapping that maps a data distribution (normally the empirical distribution) into its Kullback-Leibler projection onto the model. This method formalizes the notion that an algorithm for computing a maximum likelihood estimator also yields an algorithm for computing the efficient influence function at a user-supplied data distribution. We generalize this method to a minimum loss-based mapping. We also show how the method extends to compute the higher-order efficient influence function at an observation pair for higher-order pathwise differentiable target parameters. Finally, we propose a new method for computing the efficient influence function as a whole curve by applying the maximum likelihood mapping to a perturbation of the data distribution with score equal to an initial gradient of the pathwise derivative.
We demonstrate each method with a variety of examples.

Drawing Valid Targeted Inference When Covariate-adjusted Response-adaptive RCT Meets Data-adaptive Loss-based Estimation, With An Application To The LASSO

Wenjing Zheng et al. — Thu, 09 Jul 2015 09:52:24 PDT

Adaptive clinical trial design methods have garnered growing attention in recent years, in large part due to their greater flexibility over their traditional counterparts. One such design is the so-called covariate-adjusted, response-adaptive (CARA) randomized controlled trial (RCT). In a CARA RCT, the treatment randomization schemes are allowed to depend on the patient’s pre-treatment covariates, and the investigators have the opportunity to adjust these schemes during the course of the trial based on accruing information (including previous responses), in order to meet a pre-specified optimality criterion, while preserving the validity of the trial in learning its primary study parameter.

In this article, we propose a new group-sequential CARA RCT design and corresponding analytical procedure that admits the use of flexible data-adaptive techniques. The proposed design framework can target general adaptation optimality criteria that may not have a closed-form solution, thanks to a loss-based approach in defining and estimating the unknown optimal randomization scheme. Both in predicting the conditional response and in constructing the treatment randomization schemes, this framework uses loss-based data-adaptive estimation over general classes of functions (which may change with sample size). Since the randomization adaptation is response-adaptive, this innovative flexibility potentially translates into more effective adaptation towards the optimality criterion. To target the primary study parameter, the proposed analytical method provides robust inference for the parameter, despite arbitrarily mis-specified response models, under the most general settings.

Specifically, we establish that, under appropriate entropy conditions on the classes of functions, the resulting sequence of randomization schemes converges to a fixed scheme, and the proposed treatment effect estimator is consistent (even under a mis-specified response model), asymptotically Gaussian, and gives rise to valid confidence intervals of given asymptotic levels. Moreover, the limiting randomization scheme coincides with the unknown optimal randomization scheme when, simultaneously, the response model is correctly specified and the optimal scheme belongs to the limit of the user-supplied classes of randomization schemes. We illustrate the applicability of these general theoretical results with a LASSO-based CARA RCT. In this example, both the response model and the optimal treatment randomization are estimated using a sequence of LASSO logistic models that may increase with sample size. It follows immediately from our general theorems that this LASSO-based CARA RCT converges to a fixed design and yields consistent and asymptotically Gaussian effect estimates, under minimal conditions on the smoothness of the basis functions in the LASSO logistic models. We exemplify the proposed methods with a simulation study.

One-Step Targeted Minimum Loss-based Estimation Based on Universal Least Favorable One-Dimensional Submodels

Mark J. van der Laan — Tue, 16 Jun 2015 16:37:54 PDT

Consider a study in which one observes n independent and identically distributed random variables whose probability distribution is known to be an element of a particular statistical model, and one is concerned with estimation of a particular real-valued pathwise differentiable target parameter of this data probability distribution. The canonical gradient of the pathwise derivative of the target parameter, also called the efficient influence curve, defines an asymptotically efficient estimator as an estimator that is asymptotically linear with influence curve equal to the efficient influence curve. The targeted maximum likelihood estimator (TMLE) is a two-stage estimator obtained by constructing a so-called least favorable parametric submodel through an initial estimator with score, at zero fluctuation of the initial estimator, that spans the efficient influence curve, and iteratively maximizing the corresponding parametric likelihood until no more updates occur, at which point the updated initial estimator solves the so-called efficient influence curve equation. The latter property establishes the asymptotic efficiency of the TMLE under appropriate conditions, including that the initial estimator is within a neighborhood of the true data distribution.

In this article, we construct a one-dimensional universal least favorable submodel for which the TMLE only takes one step, and thereby requires minimal extra fitting with data to achieve its goal of solving the efficient influence curve equation. We generalize these to universal least favorable submodels through the relevant part of the data distribution as required for targeted minimum loss-based estimation, and to universal score-specific submodels for solving any other desired equation beyond the efficient influence curve equation. We demonstrate the one-step targeted minimum loss-based estimators based on such universal least favorable submodels for a variety of examples, showing that any of the goals for TMLE we previously achieved with local (typically multivariate) least favorable parametric submodels and an iterative TMLE can also be achieved with our new one-dimensional universal least favorable submodels, resulting in new one-step TMLEs for a large class of estimation problems previously addressed. Finally, remarkably, given a multidimensional target parameter, we develop a universal canonical one-dimensional submodel such that the one-step TMLE, only maximizing the log-likelihood over a univariate parameter, solves the multivariate efficient influence curve equation. This allows us to construct a one-step TMLE, based on a one-dimensional parametric submodel through the initial estimator, that solves any desired multivariate set of estimating equations.
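
For readers unfamiliar with the targeting step, the following minimal sketch may help. It does not implement the universal least favorable submodel of the article. It shows the classical TMLE with a local logistic fluctuation for one familiar parameter, the average treatment effect with a binary outcome, and checks numerically that the updated estimator solves the efficient influence curve equation. The simulated data-generating process, the assumption that the treatment mechanism `g` is known, the deliberately crude constant initial estimator, and all function names are illustrative assumptions.

```python
import math
import random


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


# Simulated data (W, A, Y) with a known treatment mechanism g(W) (assumption).
random.seed(0)
n = 2000
data = []
for _ in range(n):
    w = random.random()
    g = 0.3 + 0.4 * w                      # P(A = 1 | W), bounded away from 0 and 1
    a = 1 if random.random() < g else 0
    y = 1 if random.random() < expit(-1.0 + 1.5 * a + w) else 0
    data.append((w, a, y, g))

# Deliberately crude initial estimator of E[Y | A, W]: the marginal mean of Y.
y_bar = sum(y for _, _, y, _ in data) / n


def q_init(a, w):
    return y_bar


def clever(a, g):
    """Clever covariate spanning the efficient influence curve for the ATE."""
    return a / g - (1 - a) / (1.0 - g)


# Maximize the likelihood of the logistic fluctuation over eps (Newton-Raphson).
eps = 0.0
for _ in range(100):
    score, info = 0.0, 0.0
    for w, a, y, g in data:
        h = clever(a, g)
        p = expit(logit(q_init(a, w)) + eps * h)
        score += h * (y - p)
        info += h * h * p * (1.0 - p)
    step = score / info
    eps += step
    if abs(step) < 1e-12:
        break


def q_star(a, w, g):
    """Targeted (updated) estimator of E[Y | A, W]."""
    return expit(logit(q_init(a, w)) + eps * clever(a, g))


# Plug-in estimate of the average treatment effect.
psi = sum(q_star(1, w, g) - q_star(0, w, g) for w, _, _, g in data) / n

# Empirical mean of the efficient influence curve at the updated estimator.
eic_mean = sum(
    clever(a, g) * (y - q_star(a, w, g))
    + q_star(1, w, g) - q_star(0, w, g) - psi
    for w, a, y, g in data
) / n

print(f"eps = {eps:.4f}, psi = {psi:.4f}, mean EIC = {eic_mean:.2e}")
```

Here the Newton loop only computes the maximum likelihood estimate of the single fluctuation parameter. Because `g` is held fixed, one update of the outcome regression already drives the empirical mean of the efficient influence curve to zero up to numerical error. The universal submodels in the article aim to secure this one-step behavior in problems where a local submodel would require iteration.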