Causality in Prediction Research & Target Trial Emulation

AI methods lab seminar

Wouter van Amsterdam

2025-12-01

Today’s program

Questions with an element of ‘what if’

What will happen if we treat all patients with A (versus B)?

You’re a data scientist in a children’s hospital
You have data on:
- delivery location (home or hospital)
- neonatal outcomes (good or bad)
- pregnancy risk (high or low)
Question: if we send all deliveries to the hospital, will neonatal outcomes improve?

Good neonatal outcomes (good / total = percentage), by pregnancy risk and delivery location:

            location
risk        home               hospital
low         648 / 720 = 90%    19 / 20 = 95%

assumptions:
- women with a high risk of bad neonatal outcomes (high pregnancy risk) are referred to the hospital for delivery
- hospital deliveries lead to better outcomes for babies because more emergency treatments are available
- both pregnancy risk and hospital delivery affect neonatal outcome
Pregnancy risk is therefore a common cause of the treatment (hospital delivery) and the outcome (neonatal outcome); such a variable is called a confounder.
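The three assumptions can be written down as a small directed acyclic graph (DAG). The sketch below is only an illustration: the variable names and the helper `common_causes` are made up for this example, and the helper only looks for direct common causes, which is enough for a graph this small.

```python
# DAG implied by the assumptions above, stored as {cause: set of direct effects}.
# Variable names are illustrative labels, not column names from a real dataset.
dag = {
    "pregnancy_risk": {"hospital_delivery", "neonatal_outcome"},
    "hospital_delivery": {"neonatal_outcome"},
    "neonatal_outcome": set(),
}


def common_causes(dag, treatment, outcome):
    """Return variables with a direct arrow into both treatment and outcome."""
    return {
        node
        for node, children in dag.items()
        if treatment in children and outcome in children
    }


confounders = common_causes(dag, "hospital_delivery", "neonatal_outcome")
print(confounders)  # {'pregnancy_risk'}
```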

Our question: what if we send all deliveries to the hospital?
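In this hypothetical world, all deliveries (low risk and high risk) go to the hospital (or, for comparison, all go home)
Such a world can be observed in a randomized controlled trial (RCT), where delivery location is assigned at random instead of being decided by pregnancy risk
In the directed acyclic graph (DAG), this corresponds to removing the arrow from pregnancy risk to hospital delivery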
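Removing the arrow into the treatment can be sketched in a few lines. This assumes the DAG is stored as a plain dictionary of direct effects; the function name `intervene` and the variable names are hypothetical and chosen only for this illustration.

```python
# Observational DAG: pregnancy risk influences where the delivery takes place.
dag = {
    "pregnancy_risk": {"hospital_delivery", "neonatal_outcome"},
    "hospital_delivery": {"neonatal_outcome"},
    "neonatal_outcome": set(),
}


def intervene(dag, treatment):
    """Return a copy of the DAG with every arrow into `treatment` removed."""
    return {
        node: {child for child in children if child != treatment}
        for node, children in dag.items()
    }


# DAG of the hypothetical world (or RCT) in which we set the delivery location.
rct_dag = intervene(dag, "hospital_delivery")
print(rct_dag["pregnancy_risk"])  # {'neonatal_outcome'}
```

In the modified graph, pregnancy risk still affects the outcome but no longer affects the treatment, so it is no longer a common cause.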

in our example, if pregnancy risk is the only confounder, we can estimate the causal effect of hospital delivery on neonatal outcome by comparing delivery locations within each level of pregnancy risk
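For the low-risk stratum, the within-level comparison is simple arithmetic on the counts in the table. The sketch below uses only those counts; reading the difference as a causal effect rests on the assumption that pregnancy risk is the only confounder.

```python
from fractions import Fraction

# Low-risk stratum from the table: (good outcomes, deliveries) per location.
low_risk = {"home": (648, 720), "hospital": (19, 20)}

# Proportion of good outcomes per delivery location, kept exact with Fraction.
p_home = Fraction(*low_risk["home"])          # 648/720 = 9/10
p_hospital = Fraction(*low_risk["hospital"])  # 19/20

# Risk difference within the low-risk stratum: hospital minus home.
risk_difference = p_hospital - p_home
print(float(risk_difference))  # 0.05
```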
this is an example of a broader theme:
- we have non-experimental (observational) data
- we would like to answer a question about an intervention, as we would observe it in a randomized trial
- the causal inference toolbox lets us express our assumptions about the data (for example, in a DAG)
- from these assumptions we derive how to estimate the causal effect from observational data (covariate adjustment, inverse probability weighting, etc.)
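A minimal sketch of the two estimators named in the last bullet, assuming pregnancy risk is the only confounder. The low-risk counts come from the table; the high-risk counts are HYPOTHETICAL placeholders invented for this illustration and are not from the source data. The function names are also made up.

```python
from fractions import Fraction

# counts[risk][location] = (good outcomes, deliveries)
counts = {
    "low": {"home": (648, 720), "hospital": (19, 20)},   # from the table
    "high": {"home": (30, 60), "hospital": (160, 200)},  # HYPOTHETICAL numbers
}
n_total = sum(n for by_loc in counts.values() for _, n in by_loc.values())


def crude(location):
    """Unadjusted proportion of good outcomes at a location."""
    good = sum(by_loc[location][0] for by_loc in counts.values())
    n = sum(by_loc[location][1] for by_loc in counts.values())
    return Fraction(good, n)


def adjusted(location):
    """Covariate adjustment: sum over risk of P(good | location, risk) * P(risk)."""
    total = Fraction(0)
    for by_loc in counts.values():
        good, n = by_loc[location]
        n_stratum = sum(n_loc for _, n_loc in by_loc.values())
        total += Fraction(good, n) * Fraction(n_stratum, n_total)
    return total


def ipw(location):
    """Inverse probability weighting: weight each delivery by 1 / P(location | risk)."""
    weighted_good = Fraction(0)
    for by_loc in counts.values():
        good, n = by_loc[location]
        n_stratum = sum(n_loc for _, n_loc in by_loc.values())
        propensity = Fraction(n, n_stratum)  # P(location | risk)
        weighted_good += good / propensity
    return weighted_good / n_total


for estimator in (crude, adjusted, ipw):
    effect = estimator("hospital") - estimator("home")
    print(f"{estimator.__name__:>8}: {float(effect):+.3f}")
```

With these made-up high-risk counts, the crude comparison favors home delivery while both adjusted estimates favor the hospital. The two adjusted estimates coincide exactly because, with a single categorical confounder and no modelling, they are algebraically the same quantity.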