TMLE of a treatment-specific multivariate survival curve

This post is part of our Q&A series.

A question from graduate students in our Spring 2021 offering of the new course “Targeted Learning in Practice” at UC Berkeley:

Question:

Hi Mark,

I have a survival analysis question. I am working with a dataset that is left- and right-truncated. I am interested in estimating the treatment-specific multivariate survival function of a time-to-event variable. For example, a study where subjects have been randomized to two different treatment groups with baseline covariates W, but we only observe the outcome – time at death – for a left- and right-truncated window. Is it possible to use Targeted Maximum Likelihood Estimation (TMLE) for estimating the treatment-specific multivariate survival curve?

I have seen a few papers using TMLE for right-censored data, but I assume there are important considerations when working with doubly-truncated data.

Best,

C.B.


Answer:

Hi C.B.,

Thank you for the excellent question. You are asking about estimation of a treatment-specific survival curve when subjects are observed only within a time window: a subject is part of the sample only if a particular event, such as death, does not occur before the start of the window. The sample is therefore conditional on $T > C_l$, or, in the mirror-image case, we only observe units when $T < C_l$, for some truncation random variable $C_l$ and time until event $T$.

In another post, I talk more about this problem of left-truncation and censoring, and I will refer you to that post for your question as well. Either way, yes, TMLE can be applied to any estimation problem, so it is just a matter of establishing identification of the full-data distribution $P_X$ from observing $O = \Phi(C, X)$ drawn from a conditional distribution, given $T > C_l$, say. This handles both the biased sampling due to sampling conditional on $T > C_l$ and the more regular right-censoring, etc., that make up the censored data structure $\Phi(C, X)$. For example, one might be able to assume that $C_l$ is independent of $X$, conditional on measured variables, and show that the conditional distribution of $X$, given $T > C_l$, implies the distribution of the full-data random variable $X$, or a large part of it. This yields a two-stage identification: first identifying $P_X$ from $P_{X \mid T > C_l}$, and then identifying $P_{X \mid T > C_l}$ from the distribution of $O = \Phi(C, X)$, given $T > C_l$. Once we have done that, we can map the target quantity $\Psi^F(P_X)$ into an estimand $\Psi(P)$, specify the statistical model, and then we are ready to apply TMLE.
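To make the first identification stage concrete, here is a minimal simulation sketch. It is not a TMLE and not taken from the post referenced above: it uses a plain product-limit estimator with delayed-entry risk sets, with no covariates, no treatment, and no right-censoring. Everything in it is an assumption chosen for illustration: the exponential event time, the uniform left-truncation time drawn independently of the event time, and the hypothetical function names. It compares a naive empirical survival estimate, which ignores the biased sampling, with the truncation-adjusted one.

```python
import bisect
import math
import random


def simulate_left_truncated(n, seed=0):
    """Return n (entry, t) pairs; a unit is kept only if t > entry."""
    rng = random.Random(seed)
    sample = []
    while len(sample) < n:
        t = rng.expovariate(1.0)       # event time, Exp(1): true S(t) = exp(-t)
        entry = rng.uniform(0.0, 2.0)  # truncation time, independent of t (assumed)
        if t > entry:                  # biased sampling: observed only if t > entry
            sample.append((entry, t))
    return sample


def naive_conditional_survival(sample, a, t0):
    """Empirical P(T > t0 | T > a) that ignores the truncation."""
    beyond_a = [t for _, t in sample if t > a]
    return sum(t > t0 for t in beyond_a) / len(beyond_a)


def adjusted_conditional_survival(sample, a, t0):
    """Product-limit estimate of P(T > t0 | T > a) with delayed-entry risk sets."""
    entries = sorted(entry for entry, _ in sample)
    times = sorted(t for _, t in sample)
    surv = 1.0
    for t in times:
        if t <= a:
            continue
        if t > t0:
            break
        # At risk at time t: units that have entered (entry < t) and have not
        # yet failed (time >= t). Every unit has entry < time, so the units
        # with time < t are a subset of those with entry < t.
        at_risk = bisect.bisect_left(entries, t) - bisect.bisect_left(times, t)
        surv *= 1.0 - 1.0 / at_risk
    return surv


sample = simulate_left_truncated(5000, seed=1)
truth = math.exp(-1.0)  # P(T > 1.5 | T > 0.5) under Exp(1)
print("truth   ", round(truth, 3))
print("naive   ", round(naive_conditional_survival(sample, 0.5, 1.5), 3))
print("adjusted", round(adjusted_conditional_survival(sample, 0.5, 1.5), 3))
```

The sketch targets survival beyond a cutoff `a` rather than the whole curve because risk sets are small near time zero under left truncation. This is one way in which only "a large part" of the full-data distribution may be identified. A TMLE for the actual problem would additionally have to handle covariates, treatment, and right-censoring within a specified statistical model.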

Best Wishes,

Mark

P.S., remember to write in to our blog at vanderlaan (DOT) blog [AT] berkeley (DOT) edu. Interesting questions will be answered on our blog!