Adaptive designs and optimal subgroups

This post is part of our Q&A series.

A question from graduate students in our Fall 2018 offering of “Special Topics in Biostatistics – Adaptive Designs” at Berkeley:

Question:

Hi Mark,

We were interested in your opinion on a few topics that have come up in class a few times.

If we isolate an optimal subgroup, we can, perhaps, answer interesting questions about, say, drug efficacy (as in, does this drug work for anybody, as opposed to on average?). Having an adaptive design that then preferentially samples from this optimal group could be a good way to increase power and limit resource use. This raises interesting statistical questions: how do we deal with such biased sampling within an adaptive sequential design, and how do we generalize to a superpopulation (it seems that, by design, we converge towards the mean for the optimal subgroup)?

On the other hand, perhaps ethical concerns regarding whom we identify as an optimal subgroup could be alleviated by enforcing fairness constraints, or more elaborate “resource constraint” rules that limit how much of the resources are allocated to the optimal subgroup (e.g., wealthier families may be most responsive, but targeting them is perhaps not the best solution in practice).

Along these lines, it seems that these types of ideas can be further extended to, for example, location (adaptively deciding where to sample next), or maybe even time?

We look forward to your insight!

Best,

I.M., N.H., and R.P.


Answer:

Hi I.M., N.H., and R.P.,

We still start out with a well-defined full-data random variable $X=(W, Y_0, Y_1)$ with some probability distribution $P_{X,0}$. In this case, the design under our control might involve sampling $W$ from a marginal distribution $g_W$ and sampling $A$ from a conditional distribution $g_A$ of $A$, given $W$.

$g_W$ is now not equal to the population distribution $Q_{W,0}$, but might be a biased-sample version of it. Our observed data structure $(W, A, Y) = (W, A, Y_A)$ has a probability distribution determined by $q_0$, the conditional distribution of $(Y_0, Y_1)$, given $W$, and by $g=(g_W, g_A)$. It is generated as follows:

  1. sample $W$ from $g_W$,
  2. sample $(Y_0, Y_1)$ from $q_0$, given $W$,
  3. sample $A$ from $g_A$, given $W$, and
  4. define $O = (W, A, Y = Y_A)$.
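The four steps above can be sketched as a small simulation. This is only an illustration: the three-valued covariate, the binary outcomes, and the specific choices of `g_W`, `g_A`, and `q` are all hypothetical toy assumptions, not part of the original answer.

```python
import random

def sample_observation(rng, g_W, g_A, q):
    """Draw one O = (W, A, Y) by following the four sampling steps."""
    w = g_W(rng)                             # step 1: W ~ g_W (possibly biased)
    y0, y1 = q(rng, w)                       # step 2: (Y_0, Y_1) ~ q, given W
    a = 1 if rng.random() < g_A(w) else 0    # step 3: A ~ g_A(. | W)
    y = y1 if a == 1 else y0                 # step 4: Y = Y_A
    return w, a, y

# Hypothetical toy design: W takes values 0, 1, 2 with unequal probabilities.
def g_W(rng):
    return rng.choices([0, 1, 2], weights=[0.2, 0.3, 0.5])[0]

# Hypothetical treatment mechanism: P(A = 1 | W = w), here a fair coin.
def g_A(w):
    return 0.5

# Hypothetical q: binary potential outcomes whose effect grows with w.
def q(rng, w):
    p0 = 0.3
    p1 = 0.3 + 0.2 * (w - 1)   # P(Y_1 = 1 | W = w) is 0.1, 0.3, 0.5
    y0 = 1 if rng.random() < p0 else 0
    y1 = 1 if rng.random() < p1 else 0
    return y0, y1

rng = random.Random(0)
data = [sample_observation(rng, g_W, g_A, q) for _ in range(1000)]
```

Only `(w, a, y)` is returned, mirroring the fact that the counterfactual outcome not selected by $A$ is never observed.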

We could now define an oracle design $g_0$, some $g_{q_0}$ that might be determined by $q_0$.

For example, it might define $g_W$ as sampling from the conditional distribution of $W$, given $\mathbb{E}(Y_1 - Y_0 \mid W) > 0$, which corresponds to sampling only subjects for whom there is a positive treatment effect, and one might define the oracle design $g_{A}$ as setting $A = 1$.
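As a minimal sketch of this oracle design for a discrete covariate: the population distribution `Q_W` and the conditional effects `cate` below are hypothetical numbers, and in practice the conditional effect is unknown and has to be learned, which is exactly why the design is an oracle.

```python
def oracle_design(Q_W, cate):
    """Build (g_W, g_A) for the 'positive effect' oracle design.

    Q_W:  dict mapping w -> population probability of W = w
    cate: dict mapping w -> E(Y_1 - Y_0 | W = w), assumed known here
    """
    # Keep only covariate values with a positive conditional treatment effect.
    subgroup = {w: p for w, p in Q_W.items() if cate[w] > 0}
    total = sum(subgroup.values())
    if total == 0:
        raise ValueError("no covariate value has a positive treatment effect")
    # g_W is Q_W conditioned on the subgroup; zero mass elsewhere.
    g_W = {w: subgroup.get(w, 0.0) / total for w in Q_W}
    # g_A(w) = P(A = 1 | W = w) = 1: everyone sampled is treated.
    g_A = {w: 1.0 for w in Q_W}
    return g_W, g_A

# Hypothetical population and conditional effects.
Q_W = {0: 0.2, 1: 0.3, 2: 0.5}
cate = {0: -0.2, 1: 0.0, 2: 0.2}
g_W_star, g_A_star = oracle_design(Q_W, cate)
```

With these toy numbers only $W = 2$ has a positive effect, so the oracle design puts all of its sampling mass there.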

A target parameter of interest could be $\mathbb{E}Y_{g_0}$, i.e., the mean outcome $Y$ if we were to sample $O$ from $P_{q_0,g_0}$. I would probably first want to know how to estimate this quantity (or maybe $$\mathbb{E}Y_{g_0} - \mathbb{E}Y_{g_W,A=0},$$ the causal effect of treatment for the optimal subgroup) based on i.i.d. sampling from a fixed design $P_{q_0,g_0}$, where $g_0$ could be the actual distribution of $W$ combined with a standard RCT for $g_A$. This now resembles estimation of the treatment effect among the optimal subgroup for a fixed design, and Alex Luedtke and I have a paper on that.

Such an i.i.d. estimator, e.g., an i.i.d. TMLE of $\mathbb{E}Y_{g^*}$ for a given design $g^*$ based on sampling from a fixed design $P_{q_0,g_0}$, would now be a basis for estimation in an adaptive design where $O_i \sim P_{q_0,g_{0,i}}$ is sampled using design $g_{0,i}$, which changes with $i$ based on the data available at the time point at which the unit for $O_i$ needs to be sampled.

Such an i.i.d. TMLE of $\mathbb{E}Y_{g^*}$ for an i.i.d. design $P_{q_0,g_0}$ would involve targeting an initial estimator of $q_0$ with a TMLE step using weighting (e.g., in the loss or in the clever covariate) by $\frac{g^*}{g_0} = \frac{g^*_W g^*_A}{g_{W,0} g_{A,0}}$.
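A short derivation shows why this ratio is the natural weight. It is written for discrete $W$ and assumes positivity (the fixed design puts positive probability on every $(w,a)$ that the target design can produce); the shorthand $\bar{q}_0(a, w) = \mathbb{E}(Y_a \mid W = w)$ is introduced here only for illustration. Because $A$ is drawn given $W$ alone, $\mathbb{E}(Y \mid W = w, A = a) = \bar{q}_0(a, w)$, so

$$\mathbb{E}_{P_{q_0,g_0}}\left[\frac{g^*_W(W)\, g^*_A(A \mid W)}{g_{W,0}(W)\, g_{A,0}(A \mid W)}\, Y\right] = \sum_{w,a} g_{W,0}(w)\, g_{A,0}(a \mid w)\, \frac{g^*_W(w)\, g^*_A(a \mid w)}{g_{W,0}(w)\, g_{A,0}(a \mid w)}\, \bar{q}_0(a, w) = \sum_{w,a} g^*_W(w)\, g^*_A(a \mid w)\, \bar{q}_0(a, w) = \mathbb{E}Y_{g^*}.$$

The design factors cancel term by term, which is why reweighting data from one design recovers the mean outcome under another.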

We might now simply replace that weight by $\frac{g^*}{g_i} = \frac{g^*_W g^*_A}{g_{W,i} g_{A,i}}$ to obtain an IPTW-TMLE for the adaptive design sampling $O_i$ from $P_{q_0,g_i}$. Generally, I do not see any reason why the whole theory for sequential adaptive designs would not apply to this setting as well. It is an example of the general sequential adaptive design (see the 2008 tech report) in which $(W,A)$ are now the design variables, generated by a distribution set by the experimenter. In fact, in that technical report, I present as one example an adaptive design on the sampling of $X$ in the regression context.
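The role of the observation-specific weight can be illustrated with a much simpler estimator than a TMLE: a plain inverse-probability-weighted mean, in which each $O_i$ is weighted by the target design divided by the design actually used for unit $i$. Everything below is a hypothetical toy setup (the outcome probabilities, the target design, and the rule in `design_at` that updates the design from past data); it is a sketch of the weighting idea, not the IPTW-TMLE itself.

```python
import random

W_VALUES = [0, 1, 2]
# Hypothetical truth P(Y_a = 1 | W = w), keyed by (w, a).
TRUE_MEAN = {(0, 0): 0.3, (0, 1): 0.1,
             (1, 0): 0.3, (1, 1): 0.3,
             (2, 0): 0.3, (2, 1): 0.5}
# Hypothetical target design: sample only W = 2 and always treat,
# so the target mean outcome equals 0.5 in this toy example.
G_STAR_W = {0: 0.0, 1: 0.0, 2: 1.0}
G_STAR_A = {0: 1.0, 1: 1.0, 2: 1.0}   # P*(A = 1 | W = w)

def design_at(stats):
    """Hypothetical adaptive design for the next unit, built from past data only."""
    eff = {}
    for w in W_VALUES:
        s1, n1 = stats[(w, 1)]
        s0, n0 = stats[(w, 0)]
        eff[w] = s1 / n1 - s0 / n0 if n1 > 0 and n0 > 0 else 0.0
    best = max(W_VALUES, key=lambda w: eff[w])
    # Mix with the uniform distribution so every w keeps positive probability.
    g_W = {w: 0.5 / len(W_VALUES) + (0.5 if w == best else 0.0) for w in W_VALUES}
    # P(A = 1 | W = w): lean towards treatment where the effect looks positive.
    g_A = {w: 0.7 if eff[w] > 0 else 0.5 for w in W_VALUES}
    return g_W, g_A

def prob_a(g_A, w, a):
    """Probability of the observed treatment a under mechanism g_A."""
    return g_A[w] if a == 1 else 1.0 - g_A[w]

def run_adaptive_trial(n, seed=0):
    rng = random.Random(seed)
    stats = {(w, a): [0.0, 0] for w in W_VALUES for a in (0, 1)}
    weighted_sum = 0.0
    for _ in range(n):
        g_W, g_A = design_at(stats)
        w = rng.choices(W_VALUES, weights=[g_W[v] for v in W_VALUES])[0]
        a = 1 if rng.random() < g_A[w] else 0
        # Drawing Y directly from P(Y_a = 1 | W) gives the same law for Y = Y_A.
        y = 1 if rng.random() < TRUE_MEAN[(w, a)] else 0
        # Weight g*/g_i uses the design that was in force for this unit.
        weight = (G_STAR_W[w] * prob_a(G_STAR_A, w, a)) / (g_W[w] * prob_a(g_A, w, a))
        weighted_sum += weight * y
        stats[(w, a)][0] += y
        stats[(w, a)][1] += 1
    return weighted_sum / n

estimate = run_adaptive_trial(5000)
```

The design is recomputed before each unit is sampled and the weight is formed with that same design, which is the key bookkeeping requirement when the sampling distribution changes with $i$.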

Indeed, the choice of oracle design is in our hands, and, in particular, we can refine what we view as the target subgroup that needs to be treated. For example, if treatment is always beneficial, we might define a subgroup by a resource constraint (e.g., only 30% can be treated). One could imagine that ranking by the conditional treatment effect would yield a top 30% of covariate values that is not fair w.r.t. certain subgroups, and one could then decide to define a treatment rule that also takes into account fairness w.r.t. those subgroups, while, under both a fairness and a resource constraint, optimizing the mean outcome. That will then result in a different oracle design, and the resulting adaptive design will then learn that particular oracle design…
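To make the contrast concrete, here is a minimal sketch of two treatment rules under a 30% budget. The fairness notion used (treat the same fraction within each subgroup) is just one hypothetical choice among many, and the scores stand in for conditional treatment effects that would have to be estimated in practice.

```python
def top_fraction(indices, score, fraction):
    """Return the indices with the highest scores, up to the given fraction."""
    indices = list(indices)
    k = int(fraction * len(indices) + 1e-9)   # guard against float round-off
    ranked = sorted(indices, key=lambda i: score[i], reverse=True)
    return set(ranked[:k])

def resource_rule(score, fraction):
    """Resource constraint only: treat the overall top fraction."""
    return top_fraction(range(len(score)), score, fraction)

def fair_resource_rule(score, group, fraction):
    """Resource plus (hypothetical) fairness constraint:
    treat the top fraction within every subgroup separately."""
    treated = set()
    for g in set(group):
        members = [i for i in range(len(score)) if group[i] == g]
        treated |= top_fraction(members, score, fraction)
    return treated

# Hypothetical data: 20 units, group "a" has uniformly larger effects.
score = [0.50 - 0.02 * i for i in range(20)]
group = ["a"] * 10 + ["b"] * 10

unconstrained = resource_rule(score, 0.3)
fair = fair_resource_rule(score, group, 0.3)
```

Both rules treat six units here, but the first takes all six from group "a" while the second takes three from each group. Under this particular per-group quota, ranking within each group is what maximizes the total effect among the treated.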

Best Wishes,

Mark

P.S., remember to write in to our blog at vanderlaan (DOT) blog [AT] berkeley (DOT) edu. Interesting questions will be answered on our blog!

 