This post is part of our Q&A series.
A question from graduate students in our Fall 2018 offering of “Special Topics in Biostatistics – Adaptive Designs” at Berkeley:
Question:
Hi Mark,
Our question concerns the benefit of using a sequential adaptive design when estimating the outcome under the optimal dynamic treatment rule (for a binary treatment). We propose doing so in a 2-stage framework, where in the first stage subjects are naively randomized to treatment, $Pr(A = 1) = 0.5$. In the second stage, subjects are randomized conditional on their covariates, $\text{logit}[Pr(A = 1 \mid W)] = \widehat{BLIP}(W)$, such that they have a higher probability of receiving treatment if the conditional treatment effect estimate $\widehat{BLIP}(W)$ from the first stage suggests they will benefit from treatment. Extensions could account for the standard error of the estimated $\widehat{BLIP}(W)$.
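For concreteness, here is a minimal sketch of this two-stage assignment; the linear working model used to estimate the BLIP, the data-generating values, and all variable names below are just illustrative choices:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
expit = lambda x: 1.0 / (1.0 + np.exp(-x))

# ---- Stage 1: naive randomization, Pr(A = 1) = 0.5 ----
n1 = 500
W1 = rng.normal(size=n1)                       # baseline covariate
A1 = rng.binomial(1, 0.5, size=n1)             # treatment assigned by coin flip
Y1 = rng.normal(loc=0.3 * W1 + 0.8 * A1 * W1)  # outcome with a treatment-covariate interaction

# Estimate the BLIP (conditional treatment effect) from stage-1 data,
# here with a simple linear working model for E[Y | A, W].
X1 = np.column_stack([A1, W1, A1 * W1])
fit = LinearRegression().fit(X1, Y1)

def blip_hat(w):
    """Estimated E[Y | A = 1, W = w] - E[Y | A = 0, W = w]."""
    w = np.asarray(w, dtype=float)
    x1 = np.column_stack([np.ones_like(w), w, w])       # features when A = 1
    x0 = np.column_stack([np.zeros_like(w), w, 0 * w])  # features when A = 0
    return fit.predict(x1) - fit.predict(x0)

# ---- Stage 2: covariate-dependent randomization ----
# logit Pr(A = 1 | W) = BLIP_hat(W): subjects with a larger estimated benefit
# are more likely to be treated, but every subject keeps 0 < Pr(A = 1 | W) < 1.
n2 = 500
W2 = rng.normal(size=n2)
g2 = expit(blip_hat(W2))
A2 = rng.binomial(1, g2)
```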
Is there any theory showing the benefits of using an adaptive sequential design for this target causal parameter? The expected outcome for subjects in the trial will certainly be higher in this design than in one where treatment is always randomized naively, which offers a benefit to participants in a trial.
However, from an estimation and efficiency perspective, it is not clear to us that this design will always offer an improvement over a traditional design. To take an extreme example, if subjects in the second stage were assigned deterministically to the estimated optimal treatment (without experimentation), then estimation would suffer. It appears that at some point the reduction in experimentation should affect estimation and inference, potentially through positivity issues.
Do you have thoughts on this?
-J.R. and L.M.
Answer:
Hi J.R. and L.M.,
This is an excellent question. Interestingly, I believe that these adaptive designs, if well run, do not only benefit the patients themselves, who receive the beneficial treatment with higher probability when past data suggest evidence of a benefit. In addition, I believe such a design will also optimize the information for estimating the expectation of the counterfactual outcome under the current best estimate of the optimal dynamic treatment rule, i.e., $\mathbb{E}Y_{d_n}$. Maybe I should say: for estimating the expectation of the counterfactual outcome under the current estimate of the stochastic intervention that approximates the optimal dynamic treatment rule as sample size increases, i.e., $\mathbb{E}Y_{g_n}$.
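To fix ideas, one possible form of such a stochastic intervention (purely illustrative, not the only choice) is
$$ g_n(1 \mid W) = \text{expit}\big( c_n \, \widehat{BLIP}_n(W) \big), \qquad c_n \to \infty \text{ slowly as } n \to \infty, $$
so that at every finite $n$ each subject has a positive probability of receiving either treatment level, while $g_n(1 \mid W)$ tends to the deterministic optimal rule $d(W) = \mathbb{I}\{BLIP(W) > 0\}$ wherever the true BLIP is bounded away from zero.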
The reason is that if I sample treatment from $g$, then my design is optimal for the purpose of estimating $\mathbb{E}Y_{g}$. More generally, if the actual treatment mechanism is $g$ and the target is $\mathbb{E}Y_{g_0}$, then the ratio $g_0/g$ is a measure of how well we will do in estimating $\mathbb{E}Y_{g_0}$; that is, one wants to make the actual design as close as possible to the $g_0$ intervention one wants to evaluate. In our case $g = g_n$ changes with $n$, presumably gradually. Nonetheless, under gradual changes of $g_n$ as a function of $n$, the treatment mechanism actually used to sample the data is much closer to drawing from $g_n$ than to drawing treatment independently with probability 0.5. Here you see once again how important it is to make the adaptive design converge smoothly and wisely: do not adapt based on noise; adapt slowly based on evidence. However, even with the rough change implied by a two-stage trial design, i.e., a first stage at 0.5 and a second stage based on a perturbation of the estimator of the optimal rule, the treatment mechanism will still resemble our estimate $g_n$ more than it resembles 0.5.
Since $g_n$ approximates the estimated optimal rule $d_n$, I would also argue that the adaptive design will yield a more precise estimator of $\mathbb{E}Y_{d_n}$ than one would obtain with a fixed 0.5 design.
Formally, one compares the asymptotic variance $\frac{1}{n} \sum_i P_0 D_{g_n}(Q_0, g_i)^2$ of the TMLE of $\mathbb{E}Y_{g_n}$ under our adaptive design, where $D_{g_n}(Q_0, g_i)$ is the efficient influence curve of $\mathbb{E}Y_{g_n}$ when subject $i$ receives treatment according to $g_i$, with the asymptotic variance $P_0 D_{g_n}(Q_0, g_b)^2$ under the simple fixed-design RCT with $g_b(1 \mid W) = 0.5$. That is, one compares the asymptotic variances of the TMLE under the two designs, and we know what these are.
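Written out, in its standard form for a known stochastic intervention $g^*$ (here $g^* = g_n$, treated as fixed), this efficient influence curve is
$$ D_{g^*}(Q_0, g)(O) = \frac{g^*(A \mid W)}{g(A \mid W)}\big(Y - \bar{Q}_0(A, W)\big) + \sum_{a \in \{0,1\}} \bar{Q}_0(a, W)\, g^*(a \mid W) - \mathbb{E}Y_{g^*}, $$
with $g = g_i$ for subject $i$ in the adaptive design and $g = g_b$ in the fixed design. The closer the actual assignment mechanism is to the target $g^*$, the better behaved the weight $g^*/g$, and hence the variance $P_0 D_{g^*}(Q_0, g)^2$.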
So, after having run the adaptive design, so that you have $g_i$ for all $i$, you can make a formal comparison of these two asymptotic variances in your simulation (wherein $Q_0$ is known), demonstrating the relative efficiency of the two designs for learning $\mathbb{E}Y_{g_n}$; similarly for $\mathbb{E}Y_{d_n}$. Of course, in a simulation you can actually repeat the adaptive design many times, obtain the sampling distribution of the TMLE around the data-adaptive $\mathbb{E}Y_{g_n}$ under this adaptive design, and compare its spread with the sampling distribution of the TMLE of $\mathbb{E}Y_{g_n}$ under the 0.5 fixed design.
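As a minimal illustration of that variance comparison, here is a simulation sketch; the data-generating process, the BLIP, and the target intervention $g^*$ are all made-up and treated as known, so that $P_0 D_{g^*}(Q_0, g)^2$ can be computed directly by Monte Carlo for each design (this is not a full TMLE analysis, only the asymptotic variance comparison):

```python
import numpy as np

rng = np.random.default_rng(1)
expit = lambda x: 1.0 / (1.0 + np.exp(-x))

# Known data-generating process (so Q_0 is known, as in a simulation):
#   W ~ N(0, 1),  Qbar_0(a, w) = E[Y | A = a, W = w] = 0.3*w + 0.8*a*w,  Var(Y | A, W) = 1
def Qbar0(a, w):
    return 0.3 * w + 0.8 * a * w

blip = lambda w: Qbar0(1, w) - Qbar0(0, w)   # true BLIP, standing in for BLIP_hat
g_star = lambda w: expit(blip(w))            # target stochastic intervention g*(1 | W)
sigma2 = 1.0                                 # Var(Y | A, W)

def var_eic(g_design, n_mc=1_000_000):
    """Monte Carlo approximation of P_0 D_{g*}(Q_0, g)^2 for a design g(1 | W)."""
    w = rng.normal(size=n_mc)
    gs, gd = g_star(w), g_design(w)
    # E[D^2] = E_W[ E_A[(g*/g)^2 | W] * Var(Y | A, W) ] + Var_W( sum_a Qbar_0(a, W) g*(a | W) )
    weight2 = gs**2 / gd + (1 - gs)**2 / (1 - gd)   # E_A[(g*/g)^2 | W] when A ~ g(. | W)
    plug_in = Qbar0(1, w) * gs + Qbar0(0, w) * (1 - gs)
    return np.mean(weight2 * sigma2) + np.var(plug_in)

print("design g = g* :", var_eic(g_star))
print("design g = 0.5:", var_eic(lambda w: np.full_like(w, 0.5)))
```

Here the design that samples treatment from $g^*$ itself sets the weight term to one, while the 0.5 design inflates it, so the printed variance is smaller under the first design, as argued above.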
You also learn from this that if you respond to noise and thereby set $g_i$ equal to some noisy estimate, then the $g_i$'s are not representative of the final $d_n$ or $g_n$ (and that will not be good). Similarly, if one does not allow for experimentation, i.e., one samples $A$ deterministically from the current estimate $d_i$ of the optimal dynamic treatment, then things will go badly as well: in that case we have severe positivity issues, that is, $D_{d_n}(Q_0, g_i)$ will blow up, since the factor $\frac{d_n}{g_i}$ in front of the residual $Y - \bar{Q}$ blows up when $g_i = d_i$ is deterministic.
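Spelled out, with $d_n$ viewed as the degenerate stochastic intervention $g^*(a \mid W) = \mathbb{I}\{a = d_n(W)\}$, that leading factor is
$$ \frac{\mathbb{I}\{A = d_n(W)\}}{g_i(A \mid W)}, $$
and if $g_i = d_i$ is deterministic, then $g_i(d_n(W) \mid W) = 0$ for every subject whose assigned rule $d_i$ disagrees with the target rule $d_n$: a strict positivity violation, under which the variance expressions above are not even finite.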
So, clearly, how one adapts $g_n$ is fundamental both for eventually learning the optimal rule and for generating as much information as possible for learning the mean outcome under the final estimate of the optimal rule, or under its final stochastic approximation representing the adaptive design at the last observation $n$. The same story applies to the adaptive designs that adapt continuously in time, which we covered in class.
Best Wishes,
Mark
P.S. Remember to write in to our blog at vanderlaan (DOT) blog [AT] berkeley (DOT) edu. Interesting questions will be answered on our blog!