introduction
2024-08-06
| | | location: home | location: hospital |
|---|---|---|---|
| risk | low | 648 / 720 = 90% | 19 / 20 = 95% |
| | high | 40 / 80 = 50% | 144 / 180 = 80% |
| | | location: home | location: hospital |
|---|---|---|---|
| risk | low | 648 / 720 = 90% | 19 / 20 = 95% |
| | high | 40 / 80 = 50% | 144 / 180 = 80% |
| marginal | | 688 / 800 = 86% | 163 / 200 = 81.5% |
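The reversal between the per-risk rows and the marginal row can be reproduced directly from the counts. The sketch below is only an illustration: the dictionary layout and function names are assumptions, and the numerator of each cell is treated as a count of favourable outcomes, which the table itself does not name.

```python
# (events, total) per (risk, location), copied from the table.
# Assumption: "events" counts favourable outcomes; the table does not say.
counts = {
    ("low", "home"): (648, 720),
    ("low", "hospital"): (19, 20),
    ("high", "home"): (40, 80),
    ("high", "hospital"): (144, 180),
}


def stratum_rate(risk, location):
    """Outcome rate within one risk stratum and one location."""
    events, total = counts[(risk, location)]
    return events / total


def marginal_rate(location):
    """Outcome rate for one location, pooling both risk strata."""
    events = sum(e for (_, loc), (e, _) in counts.items() if loc == location)
    total = sum(t for (_, loc), (_, t) in counts.items() if loc == location)
    return events / total


for risk in ("low", "high"):
    print(risk, stratum_rate(risk, "home"), stratum_rate(risk, "hospital"))
print("marginal", marginal_rate("home"), marginal_rate("hospital"))
```

Within each risk stratum the hospital rate is higher, while the pooled rate is higher at home, because most hospital deliveries fall in the high-risk stratum.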
| | | location: home | location: hospital |
|---|---|---|---|
| bed rest | no | 648 / 720 = 90% | 19 / 20 = 95% |
| | yes | 40 / 80 = 50% | 144 / 180 = 80% |
| marginal | | 688 / 800 = 86% | 163 / 200 = 81.5% |
High-risk pregnancies (`pregnancy risk`) are referred to the hospital for delivery.
`pregnancy risk` and `hospital delivery` both cause `neonatal outcome`.
`pregnancy risk` is a common cause of the treatment (`hospital delivery`) and the outcome (this is called a confounder).

Hospitalized patients get more `bed rest` than those who remain at home.
`bed rest` leads to slower recovery, and thus fewer patients walking after 1 week.
`bed rest` is a mediator between the treatment (`hospitalized`) and the outcome.

`sprinkler on` may (or may not) cause `wet floor`.
`wet floor` cannot cause `sprinkler on`.
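The asymmetry in the sprinkler example can be written down as a directed graph. The sketch below assumes a plain dictionary representation (node to list of direct effects); the helper names `descendants` and `can_cause` are hypothetical and only illustrate the idea that an arrow permits causation in one direction.

```python
# Directed edges of the sprinkler example: cause -> list of direct effects.
dag = {
    "sprinkler on": ["wet floor"],
    "wet floor": [],
}


def descendants(graph, node):
    """Return every node reachable from `node` by following arrows."""
    seen = set()
    stack = list(graph[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen


def can_cause(graph, cause, effect):
    """A variable can only affect its descendants in the DAG."""
    return effect in descendants(graph, cause)


print(can_cause(dag, "sprinkler on", "wet floor"))  # True
print(can_cause(dag, "wet floor", "sprinkler on"))  # False
```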
Google Form https://bit.ly/dagquiz
Assume our DAG for a situation has the edges \(Z \to T\), \(Z \to Y\) and \(T \to Y\), and we want to learn the effect \(T\) has on \(Y\)
All we need is basic probability applied to the DAG
\[\begin{align} P(Y,T,Z) &= P(Y|T,Z)P(T,Z) \\ &= P(Y|T,Z)P(T|Z)P(Z) \end{align}\]
\[\begin{align} P_{\text{obs}}(Y,T,Z) &= P(Y|T,Z)\color{red}{P(T|Z)}P(Z) \end{align}\]
\[\begin{align} P_{\text{int}}(Y,T,Z) &= P(Y|T,Z)\color{green}{P(T)}P(Z) \end{align}\]
\[\begin{align} P_{\text{obs}}(Y,T,Z) &= P(Y|T,Z)\color{red}{P(T|Z)}P(Z) \\ P_{\text{obs}}(Y|T) &= \sum_{z} P(Y|T,Z=z)P(Z=z|T) \end{align}\]
\[\begin{align} P_{\text{int}}(Y,T,Z) &= P(Y|T,Z)\color{green}{P(T)}P(Z) \\ P_{\text{int}}(Y|T) &= \sum_{z} P(Y|T,Z=z)P_{\text{int}}(Z=z|T) \\ &= \sum_{z} P(Y|T,Z=z)\color{green}{P(Z=z)} \\ &= P(Y|\text{do}(T)) \end{align}\]
\[P_{\text{obs}}(Y|T) = \sum_{z} P(Y|T,Z=z)\color{red}{P(Z=z|T)}\]
\[P_{\text{int}}(Y|T) = \sum_{z} P(Y|T,Z=z)\color{green}{P(Z=z)} \qquad(1)\]
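The difference between the two sums can be checked by exact enumeration on a small binary model. Everything numeric below is made up for illustration, and the function names are assumptions. The sketch builds the joint distribution twice (once with \(P(T|Z)\), once with a \(P(T)\) that ignores \(Z\)) and compares the resulting conditionals with Equation 1.

```python
from itertools import product

# Hypothetical binary model for the DAG Z -> T, Z -> Y, T -> Y.
p_z = {0: 0.7, 1: 0.3}                       # P(Z=z)
p_t1_given_z = {0: 0.2, 1: 0.9}              # P(T=1 | Z=z)
p_y1_given_tz = {(0, 0): 0.90, (0, 1): 0.50,  # P(Y=1 | T=t, Z=z)
                 (1, 0): 0.95, (1, 1): 0.80}


def bernoulli(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1 - p_one


def joint(treatment_mechanism):
    """Joint P(Y,T,Z) = P(Y|T,Z) * mechanism(t, z) * P(Z)."""
    return {
        (y, t, z): bernoulli(p_y1_given_tz[(t, z)], y)
        * treatment_mechanism(t, z)
        * p_z[z]
        for y, t, z in product((0, 1), repeat=3)
    }


def p_y1_given_t(dist, t):
    """P(Y=1 | T=t) computed from a joint distribution."""
    num = sum(p for (y, tt, _), p in dist.items() if y == 1 and tt == t)
    den = sum(p for (_, tt, _), p in dist.items() if tt == t)
    return num / den


def adjustment(t):
    """Equation 1: sum over z of P(Y=1|T=t,Z=z) * P(Z=z)."""
    return sum(p_y1_given_tz[(t, z)] * p_z[z] for z in (0, 1))


observed = joint(lambda t, z: bernoulli(p_t1_given_z[z], t))  # uses P(T|Z)
intervened = joint(lambda t, z: 0.5)                          # P(T), ignores Z

for t in (0, 1):
    print(t, p_y1_given_t(observed, t), p_y1_given_t(intervened, t), adjustment(t))
```

In this toy model the conditional in the intervened distribution matches Equation 1 exactly, while the observational conditional does not.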
| | | location: home | location: hospital |
|---|---|---|---|
| risk | low | 648 / 720 = 90% | 19 / 20 = 95% |
| | high | 40 / 80 = 50% | 144 / 180 = 80% |
| marginal | | 688 / 800 = 86% | 163 / 200 = 81.5% |
\[\begin{align} P(\text{outcome}|\text{location} = \text{hospital}) &= 95\% \times 0.1 + 80\% \times 0.9 = 81.5\% \\ P(\text{outcome}|\text{location} = \text{home}) &= 90\% \times 0.9 + 50\% \times 0.1 = 86\% \end{align}\]
\[\begin{align} P(\text{outcome}|\text{do}(\text{hospital})) &= 95\% \times 0.74 + 80\% \times 0.26 = 91.1\% \\ P(\text{outcome}|\text{do}(\text{home})) &= 90\% \times 0.74 + 50\% \times 0.26 = 79.6\% \end{align}\]
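The weights 0.74 and 0.26 are the population shares of low-risk and high-risk pregnancies (740 and 260 out of 1000). A minimal sketch of the same arithmetic from the raw counts, with hypothetical function names `p_risk` and `p_outcome_do`:

```python
# (events, total) per (risk, location), copied from the table.
counts = {
    ("low", "home"): (648, 720),
    ("low", "hospital"): (19, 20),
    ("high", "home"): (40, 80),
    ("high", "hospital"): (144, 180),
}
risks = ("low", "high")
n_all = sum(total for _, total in counts.values())  # 1000 deliveries


def p_risk(risk):
    """P(risk) over the whole population, regardless of location."""
    return sum(t for (r, _), (_, t) in counts.items() if r == risk) / n_all


def p_outcome_do(location):
    """Weight each stratum rate by P(risk), not by P(risk | location)."""
    return sum(
        counts[(r, location)][0] / counts[(r, location)][1] * p_risk(r)
        for r in risks
    )


print(p_risk("low"), p_risk("high"))
print(p_outcome_do("hospital"), p_outcome_do("home"))
```

After weighting by the population risk distribution, hospital delivery comes out ahead of home delivery, the opposite of the marginal comparison.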
The task of causal inference is to take the data we have and make inferences about data from a different distribution (i.e. the intervened-on distribution)
Bogie, James; Fleming, Michael; Cullen, Breda; Mackay, Daniel; Pell, Jill P. (2021). Full directed acyclic graph. PLOS ONE. Figure. https://doi.org/10.1371/journal.pone.0249258.s003
paths
paths with conditioning variables \(r\), \(t\)
Definition 3.3.1 (Back-Door) (for pairs of variables)
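A set of variables \(Z\) satisfies the back-door criterion relative to an ordered pair of variables \((X,Y)\) in a DAG if:
(i) no node in \(Z\) is a descendant of \(X\); and
(ii) \(Z\) blocks every path between \(X\) and \(Y\) that contains an arrow into \(X\).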
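For small graphs the criterion can be checked by brute force. The sketch below is an assumption-laden illustration, not a reference implementation: the DAG is a dictionary from node to list of children, all helper names are hypothetical, and paths are enumerated exhaustively, which only scales to toy examples.

```python
def parents(dag, node):
    return {p for p, children in dag.items() if node in children}


def descendants(dag, node):
    seen, stack = set(), list(dag[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(dag[current])
    return seen


def undirected_paths(dag, start, end):
    """All simple paths between start and end, ignoring arrow direction."""
    paths, stack = [], [[start]]
    while stack:
        path = stack.pop()
        if path[-1] == end:
            paths.append(path)
            continue
        for nxt in set(dag[path[-1]]) | parents(dag, path[-1]):
            if nxt not in path:
                stack.append(path + [nxt])
    return paths


def is_blocked(dag, path, z):
    """d-separation rule applied to each interior node of one path."""
    for a, b, c in zip(path, path[1:], path[2:]):
        collider = b in dag[a] and b in dag[c]  # a -> b <- c
        if collider:
            if b not in z and not (descendants(dag, b) & z):
                return True  # unconditioned collider blocks the path
        elif b in z:
            return True  # conditioned chain or fork blocks the path
    return False


def satisfies_backdoor(dag, x, y, z):
    z = set(z)
    if z & descendants(dag, x):  # Z may not contain descendants of X
        return False
    # Back-door paths start with an arrow pointing into X.
    backdoor = [p for p in undirected_paths(dag, x, y) if x in dag[p[1]]]
    return all(is_blocked(dag, p, z) for p in backdoor)


confounded = {"Z": ["T", "Y"], "T": ["Y"], "Y": []}
mediated = {"T": ["M", "Y"], "M": ["Y"], "Y": []}
print(satisfies_backdoor(confounded, "T", "Y", {"Z"}))  # True
print(satisfies_backdoor(confounded, "T", "Y", set()))  # False
print(satisfies_backdoor(mediated, "T", "Y", {"M"}))    # False
```

In the `mediated` graph the empty set satisfies the criterion, while `{"M"}` does not, because a mediator is a descendant of the treatment.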
Theorem 3.3.2 (Back-Door Adjustment)
If a set of variables \(Z\) satisfies the back-door criterion relative to \((X,Y)\), then the causal effect of \(X\) on \(Y\) is identifiable and is given by the formula
\[P(y|\text{do}(x)) = \sum_z P(y|x,z)P(z) \qquad(2)\]
Back-door adjustment with \(z\) requires computing \(P(y|x,z)\).
By the product rule:
\[P(y|x,z) = \frac{P(y,x,z)}{P(x,z)}\]
This division is only defined when \(P(x,z) > 0\),
which is the same as the positivity assumption from Day 1 in Potential Outcomes.
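In sample data the positivity condition shows up as empty cells. The sketch below is an illustrative check under assumed names (`positivity_violations` is hypothetical): it lists every treatment and stratum combination with no observations, since the estimate of \(P(y|x,z)\) would divide by zero there. The second dataset is made up to show a violation.

```python
def positivity_violations(cell_totals, treatments, strata):
    """Return the (x, z) cells that were never observed."""
    return [
        (x, z)
        for z in strata
        for x in treatments
        if cell_totals.get((x, z), 0) == 0
    ]


# Cell totals from the delivery table: (location, risk) -> number of deliveries.
observed = {
    ("home", "low"): 720,
    ("hospital", "low"): 20,
    ("home", "high"): 80,
    ("hospital", "high"): 180,
}

# Hypothetical data where no low-risk pregnancy was delivered in hospital.
hypothetical = {
    ("home", "low"): 740,
    ("home", "high"): 80,
    ("hospital", "high"): 180,
}

treatments = ("home", "hospital")
strata = ("low", "high")
print(positivity_violations(observed, treatments, strata))      # []
print(positivity_violations(hypothetical, treatments, strata))  # [('hospital', 'low')]
```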
Wouter van Amsterdam — WvanAmsterdam — vanamsterdam.github.io