TMLE of a treatment-specific multivariate survival curve

This post is part of our Q&A series.

A question from graduate students in our Spring 2021 offering of the new course “Targeted Learning in Practice” at UC Berkeley:


Hi Mark,

I have a survival analysis question. I am working with a dataset that is left- and right-truncated. I am interested in estimating the treatment-specific multivariate survival function of a time-to-event variable. For example, a study where subjects have been randomized to two different treatment groups with baseline covariates $W$, but we only observe the outcome – time at death – for a left- and right-truncated window. Is it possible to use Targeted Maximum Likelihood Estimation (TMLE) for estimating the treatment-specific multivariate survival curve?

I have seen a few papers using TMLE for right-censored data, but I assume there are important considerations when working with doubly-truncated data.




Hi C.B.,

Thank you for the excellent question. You are asking about estimating a treatment-specific survival curve when sampling is restricted to a time window. A subject is part of the sample only if a particular event, such as death, does not occur before the start of the window, so that the sample is drawn conditional on $T > C_l$; or we only observe units for which $T < C_l$. Here $C_l$ is a truncation random variable and $T$ is the time until the event.

In another post, I talk more about this problem of left-truncation and censoring, and I refer you to that post for your question as well. Either way, yes: TMLE can be applied to any estimation problem, so it is just a matter of establishing identification of the full-data distribution $P_X$ from observing $O = \Phi(C, X)$ under sampling conditional on $T > C_l$, say. This handles both the biased sampling that results from conditioning on $T > C_l$ and the more regular right-censoring, etc., that make up the censored data structure $\Phi(C, X)$. For example, one might be able to assume that $C_l$ is independent of $X$, conditional on measured variables, and show that the conditional distribution of $X$, given $T > C_l$, determines the distribution of the full-data random variable $X$, or a large part of it. This gives a two-stage identification: first identify $P_X$ from $P_{X \mid T > C_l}$, and then identify $P_{X \mid T > C_l}$ from the distribution $P$ of $O = \Phi(C, X)$, given $T > C_l$. Once we have done that, we can map the target quantity $\Psi^F(P_X)$ into an estimand $\Psi(P)$, specify the statistical model, and then we are ready to apply TMLE.
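To make the identification step concrete, here is a minimal simulation sketch of the simplest case: left-truncation only, with no covariates, no treatment, and no right-censoring. It is not TMLE. It uses the classical product-limit estimator with delayed entry at $C_l$ as a stand-in, to illustrate that when $C_l$ is independent of $T$, the survival function of $T$ can be recovered from a sample drawn conditional on $T > C_l$. The distributions ($T$ exponential with rate 1, $C_l$ uniform on $[0, 2]$), the sample size, and all function names are assumptions chosen for illustration.

```python
import bisect
import math
import random


def simulate_left_truncated(n, seed=0):
    """Draw n full-data pairs (C_l, T) and keep only those with T > C_l."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        t = rng.expovariate(1.0)     # assumed: T ~ Exp(1), so S(t) = exp(-t)
        c_l = rng.uniform(0.0, 2.0)  # assumed: C_l ~ Uniform(0, 2), independent of T
        if t > c_l:                  # a unit enters the sample only if T > C_l
            sample.append((c_l, t))
    return sample


def naive_survival(sample, t0):
    """Empirical P(T > t0) in the truncated sample, ignoring the truncation."""
    return sum(t > t0 for _, t in sample) / len(sample)


def product_limit_survival(sample, t0):
    """Product-limit estimate of S(t0), treating C_l as a delayed entry time."""
    entries = sorted(c for c, _ in sample)
    events = sorted(t for _, t in sample)
    surv = 1.0
    for t in events:
        if t > t0:
            break
        # Risk set at t: units with C_l < t <= T. Every sampled unit has
        # C_l < T, so this is #{C_l < t} minus #{T < t}.
        at_risk = bisect.bisect_left(entries, t) - bisect.bisect_left(events, t)
        surv *= 1.0 - 1.0 / at_risk  # one event per time (no ties expected)
    return surv


sample = simulate_left_truncated(40000, seed=1)
truth = math.exp(-1.0)
naive = naive_survival(sample, 1.0)
adjusted = product_limit_survival(sample, 1.0)
print(f"truth={truth:.3f}  naive={naive:.3f}  adjusted={adjusted:.3f}")
```

Under these assumptions, the naive estimate should sit well above the true $S(1) = e^{-1}$, because units with short event times are under-sampled, while the risk-set-adjusted estimate should be close to the truth. In the full problem, the same identification logic would have to be carried out conditional on treatment and $W$, and combined with right-censoring, before a TMLE is constructed for the resulting estimand $\Psi(P)$.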

Best Wishes,


P.S., remember to write in to our blog at vanderlaan (DOT) blog [AT] berkeley (DOT) edu. Interesting questions will be answered on our blog!
