In this article, we propose a new group-sequential CARA RCT design and corresponding analytical procedure that admits the use of flexible data-adaptive techniques. The proposed design framework can target general adaptation optimality criteria that may not have a closed-form solution, thanks to a loss-based approach in defining and estimating the unknown optimal randomization scheme. Both in predicting the conditional response and in constructing the treatment randomization schemes, this framework uses loss-based data-adaptive estimation over general classes of functions (which may change with sample size). Since the randomization adaptation is response-adaptive, this flexibility potentially translates into more effective adaptation towards the optimality criterion. To target the primary study parameter, the proposed analytical method provides robust inference for the parameter, despite arbitrarily mis-specified response models, under the most general settings.

Specifically, we establish that, under appropriate entropy conditions on the classes of functions, the resulting sequence of randomization schemes converges to a fixed scheme, and the proposed treatment effect estimator is consistent (even under a mis-specified response model), asymptotically Gaussian, and gives rise to valid confidence intervals of given asymptotic levels. Moreover, the limiting randomization scheme coincides with the unknown optimal randomization scheme when, simultaneously, the response model is correctly specified and the optimal scheme belongs to the limit of the user-supplied classes of randomization schemes. We illustrate the applicability of these general theoretical results with a LASSO-based CARA RCT. In this example, both the response model and the optimal treatment randomization are estimated using a sequence of LASSO logistic models that may increase with sample size. It follows immediately from our general theorems that this LASSO-based CARA RCT converges to a fixed design and yields consistent and asymptotically Gaussian effect estimates, under minimal conditions on the smoothness of the basis functions in the LASSO logistic models. We exemplify the proposed methods with a simulation study.

In this article we construct a one-dimensional universal least favorable submodel for which the TMLE takes only one step, and thereby requires minimal extra fitting with data to achieve its goal of solving the efficient influence curve equation. We generalize these to universal least favorable submodels through the relevant part of the data distribution, as required for targeted minimum loss-based estimation, and to universal score-specific submodels for solving any other desired equation beyond the efficient influence curve equation. We demonstrate the one-step targeted minimum loss-based estimators based on such universal least favorable submodels for a variety of examples, showing that any of the goals for TMLE we previously achieved with local (typically multivariate) least favorable parametric submodels and an iterative TMLE can also be achieved with our new one-dimensional universal least favorable submodels, resulting in new one-step TMLEs for a large class of estimation problems previously addressed. Finally, given a multidimensional target parameter, we develop a universal canonical one-dimensional submodel such that the one-step TMLE, maximizing the log-likelihood over only a univariate parameter, solves the multivariate efficient influence curve equation. This allows us to construct a one-step TMLE, based on a one-dimensional parametric submodel through the initial estimator, that solves any desired multivariate set of estimating equations.
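As a rough illustration of why a single step can suffice, the construction can be sketched for the log-likelihood loss. The notation here is an assumption and not taken from the abstract: $p$ is the density of the initial estimator $P$, $D^*(P)$ is the efficient influence curve at $P$, and $P_n$ denotes the empirical mean.

```latex
% Sketch under the stated assumptions: a one-dimensional submodel through p
% whose score equals the efficient influence curve at every \epsilon,
% not only at \epsilon = 0.
\[
  p_\epsilon(o) \;=\; p(o)\,\exp\!\left( \int_0^{\epsilon} D^*(P_x)(o)\, dx \right),
  \qquad
  \frac{d}{d\epsilon} \log p_\epsilon(o) \;=\; D^*(P_\epsilon)(o).
\]
% Consequently, an interior maximizer of the empirical log-likelihood along
% the submodel solves the efficient influence curve equation directly:
\[
  \epsilon_n \;=\; \arg\max_{\epsilon} \, P_n \log p_\epsilon
  \quad\Longrightarrow\quad
  P_n D^*(P_{\epsilon_n}) \;=\; 0 .
\]
```

With a local least favorable submodel the score matches $D^*$ only at $\epsilon = 0$, which is why the update generally has to be iterated; here the match holds along the whole path.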

For that purpose we propose a new online one-step estimator, which is proven to be asymptotically efficient under regularity conditions. This estimator takes as input online estimators of the relevant part of the data generating distribution and of the nuisance parameter that are required for efficient estimation of the target parameter. These estimators could be online stochastic gradient descent estimators based on large parametric models, as developed in the current literature, but we also propose other online data-adaptive estimators that do not rely on the specification of a particular parametric model.

We also present a targeted version of this online one-step estimator that presumably minimizes the one-step correction and thereby might be more robust in finite samples. These online one-step estimators are not substitution estimators and might therefore be unstable in finite samples if the target parameter is borderline identifiable.

Therefore we also develop an online targeted minimum loss-based estimator, which updates the current estimator of the relevant part of the data generating distribution with each new block of data, and estimates the target parameter with the corresponding plug-in estimator. This online substitution estimator is also proven to be asymptotically efficient under the same regularity conditions required for asymptotic normality of the online one-step estimator.

The online one-step estimator, targeted online one-step estimator, and online TMLE are demonstrated for estimation of a causal effect of a binary treatment on an outcome based on a dynamic database that gets regularly updated, a common scenario for the analysis of electronic medical record databases.
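To make the binary-treatment example concrete, the following is a minimal sketch of an online one-step estimator of the average treatment effect, and not the authors' implementation. Everything named here is an assumption made for illustration: the simulated data source `simulate_block`, the logistic working models for the outcome regression and the treatment mechanism, the block-wise gradient update `sgd_step`, and the truncation bounds on the estimated treatment probability. The point it shows is the ordering: each new block is scored with nuisance fits built from past blocks only, and the fits are updated afterwards.

```python
import math
import random


def expit(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_block(rng, n):
    """Hypothetical data source: n draws of (W, A, Y) with binary A and Y."""
    block = []
    for _ in range(n):
        w = rng.gauss(0.0, 1.0)
        a = 1 if rng.random() < expit(0.5 * w) else 0
        y = 1 if rng.random() < expit(-0.5 + a + 0.5 * w) else 0
        block.append((w, a, y))
    return block


def predict(beta, x):
    """Predicted probability from a logistic working model."""
    return expit(sum(b * xi for b, xi in zip(beta, x)))


def sgd_step(beta, rows, lr):
    """One gradient step on the mean logistic log-loss of a single block.

    rows is a list of (feature_vector, binary_target) pairs.
    """
    grad = [0.0] * len(beta)
    for x, target in rows:
        resid = predict(beta, x) - target
        for j, xj in enumerate(x):
            grad[j] += resid * xj / len(rows)
    return [b - lr * gj for b, gj in zip(beta, grad)]


def online_one_step_ate(blocks, lr=0.5, g_bounds=(0.05, 0.95)):
    """Running online one-step estimate of E[Y(1) - Y(0)] over a block stream."""
    beta_q = [0.0, 0.0, 0.0]  # outcome regression, features [1, A, W]
    beta_g = [0.0, 0.0]       # treatment mechanism, features [1, W]
    total, n_seen = 0.0, 0
    for block in blocks:
        # Step 1: score the new block using fits from PAST blocks only.
        for w, a, y in block:
            q1 = predict(beta_q, (1.0, 1.0, w))
            q0 = predict(beta_q, (1.0, 0.0, w))
            g1 = min(max(predict(beta_g, (1.0, w)), g_bounds[0]), g_bounds[1])
            h = a / g1 - (1 - a) / (1.0 - g1)
            qa = q1 if a == 1 else q0
            # Plug-in contribution plus influence-curve correction.
            total += q1 - q0 + h * (y - qa)
        n_seen += len(block)
        # Step 2: only now update the online nuisance estimators.
        beta_q = sgd_step(beta_q, [((1.0, a, w), y) for w, a, y in block], lr)
        beta_g = sgd_step(beta_g, [((1.0, w), a) for w, a, y in block], lr)
    return total / n_seen


if __name__ == "__main__":
    rng = random.Random(0)
    stream = (simulate_block(rng, 200) for _ in range(200))
    print(online_one_step_ate(stream))
```

The estimate is a running mean, so only the two coefficient vectors and two scalars need to be stored between blocks, which is what makes the scheme suitable for a database that keeps growing. Because it averages corrected contributions rather than plugging a single fitted distribution into the parameter mapping, it is not a substitution estimator.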

Finally, we extend these online estimators to a group sequential adaptive design in which certain components of the data generating experiment are continuously fine-tuned based on past data, and the new data generating distribution is then used to generate the next block of data.
