Increased clinical interest in individualized ‘adaptive’ treatment policies has shifted the methodological focus for their development from the analysis of naturalistically observed strategies to experimental evaluation of a pre-selected set of strategies via multi-stage designs. Because multi-stage studies often avoid the ‘curse of dimensionality’ inherent in uncontrolled studies, and hence the need to parametrically smooth trial data, it is not surprising in this context to find direct connections among different methodological approaches. We show by asymptotic and algebraic proof that the maximum likelihood (ML) and optimal semi-parametric estimators of the mean of a treatment policy and its standard error are equal under certain experimental conditions. The two methodologies offer conceptually different formulations, which we exploit to develop a unified and efficient approach to design and inference for multi-stage trials of policies that adapt treatment according to discrete responses. We derive a sample size formula expressed in terms of a parametric (regression-based) version of the optimal semi-parametric population variance. Non-parametric (sample-based) ML estimation performed well in simulation studies, in terms of achieved power, even though sample sizes relied on parametric re-expression. For a variety of simulated scenarios, ML outperformed the semi-parametric approach, which used a priori rather than estimated randomization probabilities, because the test statistic was sensitive to even small differences arising in finite samples.
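The stated equality between the ML and semi-parametric estimates can be illustrated numerically with a minimal sketch, under assumptions that are not taken from the paper. The sketch assumes a hypothetical two-stage trial with equal-probability randomization at both stages, a binary response `R`, and the policy "start on arm 1, continue if responding, switch to second-stage option 1 if not". It uses a simple inverse-probability-weighted estimator as a stand-in for the optimal semi-parametric estimator. All function names, effect sizes, and the sample size are illustrative. With randomization probabilities estimated from the sample, the weighted estimate reduces algebraically to the sample-based ML estimate; with the a priori value 1/2 it generally differs in finite samples.

```python
import random
from statistics import mean


def simulate_trial(n, seed=0):
    """Simulate a hypothetical two-stage trial; returns (a1, r, a2, y) tuples."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a1 = rng.randint(0, 1)                       # stage-1 arm, a priori probability 1/2
        r = 1 if rng.random() < 0.4 + 0.2 * a1 else 0  # discrete (binary) response
        a2 = None if r == 1 else rng.randint(0, 1)   # only non-responders are re-randomized
        effect2 = 0.3 * a2 if a2 is not None else 0.0
        y = rng.gauss(1.0 + 0.5 * a1 + 1.0 * r + effect2, 1.0)
        rows.append((a1, r, a2, y))
    return rows


def ml_mean(rows):
    """Sample-based ML estimate: response proportions times cell means."""
    arm = [row for row in rows if row[0] == 1]
    resp = [y for (_, r, _, y) in arm if r == 1]
    switched = [y for (_, r, a2, y) in arm if r == 0 and a2 == 1]
    p_resp = len(resp) / len(arm)
    return p_resp * mean(resp) + (1.0 - p_resp) * mean(switched)


def ipw_mean(rows, p1, p2):
    """Inverse-probability-weighted estimate of the same policy mean."""
    total = 0.0
    for a1, r, a2, y in rows:
        if a1 != 1:
            continue                  # not consistent with the policy at stage 1
        if r == 1:
            total += y / p1           # responders: only stage-1 randomization applies
        elif a2 == 1:
            total += y / (p1 * p2)    # non-responders who received the policy's option
    return total / len(rows)


def estimated_probs(rows):
    """Observed randomization proportions for the policy's arms."""
    arm = [row for row in rows if row[0] == 1]
    nonresp = [row for row in arm if row[1] == 0]
    p1_hat = len(arm) / len(rows)
    p2_hat = sum(1 for row in nonresp if row[2] == 1) / len(nonresp)
    return p1_hat, p2_hat


rows = simulate_trial(400, seed=1)
ml = ml_mean(rows)
ipw_est = ipw_mean(rows, *estimated_probs(rows))   # estimated probabilities
ipw_prior = ipw_mean(rows, 0.5, 0.5)               # a priori probabilities

print("ML:", ml)
print("IPW, estimated probabilities:", ipw_est)
print("IPW, a priori probabilities:", ipw_prior)
```

The first two printed values agree up to floating-point rounding, because the estimated probabilities cancel the sample counts in the weighted sum. The third uses the design probabilities and so carries the extra finite-sample variation.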


Clinical Trials