TMLE of a treatment-specific multivariate survival curve · The Research Group of Mark van der Laan

Mark van der Laan 01 May 2021, 17:54


This post is part of our Q&A series.

A question from graduate students in our Spring 2021 offering of the new course “Targeted Learning in Practice” at UC Berkeley:

Question:

Hi Mark,

I have a survival analysis question. I am working with a dataset that is left- and right-truncated. I am interested in estimating the treatment-specific multivariate survival function of a time-to-event variable. For example, consider a study where subjects have been randomized to two different treatment groups with baseline covariates, but we only observe the outcome (time of death) within a left- and right-truncated window. Is it possible to use Targeted Maximum Likelihood Estimation (TMLE) to estimate the treatment-specific multivariate survival curve?

I have seen a few papers using TMLE for right-censored data, but I assume there are important considerations when working with doubly-truncated data.

Best,

C.B.

Answer:

Hi C.B.,

Thank you for the excellent question. You are asking about estimating a treatment-specific survival curve when there is a time window and a subject is part of the sample only if a particular event, such as death, has not occurred before the start of the window. The sample is therefore conditional on the time until the event exceeding a truncation random variable; that is, we only observe units whose event time is larger than their truncation time.
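In notation, and purely as an assumption for illustration (the symbols $T$ for the time until the event and $L$ for the truncation time are chosen here and may differ from those used elsewhere), this sampling scheme means that we observe draws from a conditional law rather than from the marginal law of $(T, L)$:

$$
P_{\text{obs}}(T \le t,\, L \le l) \;=\; P(T \le t,\, L \le l \mid L < T) \;=\; \frac{P(T \le t,\, L \le l,\, L < T)}{P(L < T)}.
$$

Any identification argument has to undo this conditioning, which generally requires an assumption on how $L$ and $T$ relate.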

In another post, I discuss this problem of left-truncation and censoring in more detail, and I refer you to it for your question as well. In short, yes: TMLE can be applied to any estimation problem, so it is a matter of establishing identification of the full-data distribution $P_X$ from the observed data, which are drawn from a conditional distribution given that the unit is sampled. The identification has to handle both the biased sampling induced by the truncation condition and the more regular right-censoring, which together make up a censored data structure. For example, one might be able to assume that the truncation variable is independent of the event time, conditional on measured variables, and show that the conditional distribution of the censored data structure, given the truncation condition, implies the distribution of the full-data random variable, or a large part of it. This gives a two-stage identification: first identify the distribution of the censored data structure from the conditional (truncated) distribution of the observed data, and then identify $P_X$ from the distribution of the censored data structure. Once that is done, we can map the target quantity into an estimand, specify the statistical model, and then we are ready to apply TMLE.
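To make the identification step concrete, here is a minimal sketch in the simplest possible setting. It is not a TMLE. It assumes a single treatment arm, no covariates, a truncation time that is independent of the event time, and independent right-censoring. Under those assumptions the hazard at time t is identified by the proportion of events at t among units that have already entered the sample and are still under observation, which gives a product-limit estimator with delayed entry. The function name `truncated_product_limit` and its argument names are hypothetical, chosen only for this illustration.

```python
import numpy as np


def truncated_product_limit(entry, time, event):
    """Product-limit survival estimate with delayed entry (left truncation).

    entry : truncation (study-entry) times; a unit is sampled only if entry < time
    time  : observed follow-up time, min(event time, right-censoring time)
    event : 1 if the event was observed at `time`, 0 if right-censored

    Assumes (illustrative assumption) that truncation and right-censoring are
    independent of the event time.
    """
    entry = np.asarray(entry, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    if np.any(entry >= time):
        raise ValueError("each unit must enter before its observed time")

    event_times = np.unique(time[event == 1])  # sorted distinct event times
    surv = np.empty(len(event_times))
    s = 1.0
    for k, t in enumerate(event_times):
        # Risk set at t: units that have entered before t and are still followed at t.
        at_risk = np.sum((entry < t) & (time >= t))
        deaths = np.sum((time == t) & (event == 1))
        s *= 1.0 - deaths / at_risk
        surv[k] = s
    return event_times, surv


# Toy data: four units, one of them right-censored at time 5.
times, surv = truncated_product_limit(
    entry=[0, 0, 2, 1], time=[3, 5, 4, 2], event=[1, 0, 1, 1]
)
# Survival just after t = 2, 3, 4 is 2/3, 4/9, 2/9.
print(times, surv)
```

In a randomized study one could apply such an estimator within each treatment arm as a crude starting point. With covariate-dependent truncation or censoring, the same risk-set idea would have to be carried through conditional hazards, which is where a TMLE would come in.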

Best Wishes,

Mark

P.S., remember to write in to our blog at vanderlaan (DOT) blog [AT] berkeley (DOT) edu. Interesting questions will be answered on our blog!