Mathematics intro

Probabilities, Conditional Independence and Assumption Parlance

Wouter van Amsterdam

2025-07-07

Marginal, Joint and Conditional probabilities

Setup

Probability statements about random events \(A\) and \(B\)

\(A\): patient dies (\(A=1\))
\(B\): patient has cancer (\(B=1\))

Say we have 100 patients, we can tabulate them according to their cancer status and whether they died or not.

joint probability table
		A
		dies	lives
B	has cancer	5	5	10
	has no cancer	10	80	90
		15	85	100

Marginal probabilities

Marginal probabilities concern probabilities of one random event, regardless of the other random event.
We read these probabilities from the margins of the joint probability table.

statement	interpretation
\(P(A=1)\)	marginal probability that event \(A\) occurs
\(P(B=1)\)	marginal probability that event \(B\) occurs

		A
		dies	lives
B	has cancer	5	5	10
	has no cancer	10	80	90
		15	85	100

		A
		dies	lives
B	has cancer
	has no cancer
		15	85	100

\(P(A=1) = 15 / 100\)

		A
		dies	lives
B	has cancer			10
	has no cancer			90
				100

\(P(B=1) = 10 / 100\)

Joint Probabilities

A joint probability concerns the probability of two random events jointly occurring together.
These are a based on a single cell in the joint probability table

statement	interpretation
\(P(A)\)	marginal probability that event \(A\) occurs
\(P(A=1,B=1)\)	joint probability of \(A\) and \(B\)

		A
		dies	lives
B	has cancer	5	5	10
	has no cancer	10	80	90
		15	85	100

		A
		dies	lives
B	has cancer	5
	has no cancer
				100

\(P(B=1,A=1) = 5 / 100\)

Conditional probabilities

Conditional probabilities concern the probability of one random event given that another random event has occurred.
e.g. what is the probability that a patient dies (\(A=1\)) given that they have cancer (\(B=1\))?
These are read from the joint probability table by looking in the row or column of the conditioning event.

statement	interpretation
\(P(A)\)	marginal probability that event \(A\) occurs
\(P(A,B)\)	joint probability of \(A\) and \(B\)
\(P(A=1\|B=1)\)	conditional probability of \(A\) given \(B\)

		A
		dies	lives
B	has cancer	5	5	10
	has no cancer	10	80	90
		15	85	100

- marginal \(P(A=1) = 15/100\)

		A
		dies	lives
B	has cancer	5	5	10
	has no cancer

- marginal \(P(A=1) = 15/100\)

- conditional \(P(A=1|B=1) = 5 / 10\)

Probability rules and identities

Sum rule

A marginal probability can be computed by summing over the joint probabilities of all possible values of the other random event.

statement	interpretation
\(P(A) = \sum_{b} P(A,B=b)\)	marginal is sum over joint

		A
		dies	lives
B	has cancer	5
	has no cancer	10
		15		100

\[\begin{align} P(A=1) &= P(A=1,B=1) + P(A=1,B=0) \\ &= 5/100 + 10/100 \\ & = 15/100 \end{align}\]

Product rule

A joint probability can be computed by multiplying the conditional probability of one random event given the other random event with the marginal probability of the other random event.

statement	interpretation
\(P(A) = \sum_{b} P(A,B=b)\)	marginal is sum over joint
\(P(A,B) = P(A\|B)P(B)\)	product rule

		A
		dies	lives
B	has cancer	5		10
	has no cancer
				100

\[\begin{align} P(A=1,B=1) &= P(A=1|B=1)P(B=1) \\ &= 5/10 * 10/100 \\ & = 5/100 \end{align}\]

With these two rules and basic algebra, we can derive more identities

statement	interpretation
\(P(A) = \sum_{b} P(A,B=b)\)	marginal is sum over joint
\(P(A,B) = P(A\|B)P(B)\)	product rule

Product rule - different form

a conditional probability can be computed by dividing the joint probability of the two random events by the marginal probability of the other random event, since¹

\[ x = y * z \implies y = \frac{x}{z} \]

statement	interpretation
\(P(A) = \sum_{b} P(A,B=b)\)	marginal is sum over joint
\(P(A,B) = P(A\|B)P(B)\)	product rule
\(P(A\|B) = \frac{P(A,B)}{P(B)}\)	conditional is joint over marginal (follows from product rule)

Law of total probability

statement	interpretation
\(P(A) = \sum_{b} P(A,B=b)\)	marginal is sum over joint
\(P(A,B) = P(A\|B)P(B)\)	product rule
\(P(A\|B) = \frac{P(A,B)}{P(B)}\)	conditional is joint over marginal (follows from product rule)
\(P(A\|C) = \sum_{b} P(A\|B=b,C)P(B=b\|C)\)	total probability (consequence of marginal vs joint and product rule)

this identity can be proven quite easily using the product rule and the sum rule Section 5.1

Marginal independence and conditional independence

Marginal independence

statement	interpretation
\(P(A,B) = P(A)P(B)\)	(marginal) independence of \(A\) and \(B\)

knowing \(A\) has no information on what to expect of \(B\)
If I roll a die, the result of that die (\(A\)) has no information on the weather in the Netherlands (\(B\))

Conditional independence

some events may not be independent in general, but they may be independent given some other event \(C\).
statement:

\[P(A,B|C) = P(A|C)P(B|C)\]

Conditional Independence in an example

Charlie calls Alice and reads her script \(C\), then she calls Bob and reads him the same
A week later we ask Alice to repeat the story Charlie told her, she remembered \(A\), a noisy version of \(C\)
We ask Bob the same, he recounts \(B\), a different noisy version of \(C\)
Are \(A\) and \(B\) independent? No! \(P(A,B) \neq P(A)P(B)\)
- If we learn \(A\) from Alice, we can get a good guess about \(B\) from Bob
If we knew \(C\), would hearing \(A\) give us more information about \(B\)?
- No, because all the shared information between \(A\) and \(B\) is explained by \(C\), so:
- \(P(A,B) \neq P(A)P(B)\)
- \(P(A,B|C) = P(A|C)P(B|C)\)
Variables can be marginally dependent but conditionally independent (and vice-versa)

Conditional independence, stated differently

\[P(A|B,C) = P(A|C)\]

statement	interpretation
\(P(A,B) = P(A)P(B)\)	(marginal) independence of \(A\) and \(B\)
\(P(A,B\|C) = P(A\|C)P(B\|C)\)	conditional independence of \(A\) and \(B\) given \(C\)
\(P(A\|B,C) = P(A\|C)\)	conditional independence of \(A\) and \(B\) given \(C\)

both statements of conditional independence can be shown to be equivalent (when the involved conditional probabilities are well-defined) Section 5.2

Assumption parlance

Hierarchy of conditions / assumptions

necessary assumption:
- A must hold for B to be true
- having a heart is necessary for having a heart rate
sufficient assumption:
- B is always true when A holds
- Being a square is sufficient to be a rectangle
strong assumption:
- requires strong evidence, we’d rather not make these
weak assumption:
- requires weak evidence
strong vs weak assumption are judged on relative terms
- if assumption A is sufficient for B, B cannot be a stronger assumption than A

Proofs

Law of total (conditional) probability

We are asked to prove:

\[P(A \mid C) = \sum_b P(A \mid B = b, C) \, P(B = b \mid C)\]

\[\begin{align} P(A \mid C) &= \sum_b P(A, B = b \mid C) && \text{(sum rule)} \\ &= \sum_b P(A \mid B = b, C) \, P(B = b \mid C) && \text{(product rule)} \end{align}\]

Conditional independence equivalent statements

We will prove that the conditional independence statement

\[ P(A \mid B, C) = P(A \mid C) \]

is equivalent to

\[ P(A, B \mid C) = P(A \mid C) \cdot P(B \mid C) \]

using basic rules of probability.

Conditional independence equivalent statements

✅ Proof (⇐ direction):

Assume

\[ P(A, B \mid C) = P(A \mid C) \cdot P(B \mid C) \]

By the product rule,

\[ P(A, B \mid C) = P(A \mid B, C) \cdot P(B \mid C) \]

Comparing both expressions:

\[ P(A \mid B, C) \cdot P(B \mid C) = P(A \mid C) \cdot P(B \mid C) \]

Divide both sides by \(P(B \mid C) > 0\), we get:

\[ P(A \mid B, C) = P(A \mid C) \]

Conditional independence equivalent statements

✅ Proof (⇒ direction):

Assume

\[ P(A \mid B, C) = P(A \mid C) \]

Again by the product rule:

\[ P(A, B \mid C) = P(A \mid B, C) \cdot P(B \mid C) \]

Substitute \(P(A \mid C)\) for \(P(A \mid B, C)\), we get:

\[ P(A, B \mid C) = P(A \mid C) \cdot P(B \mid C) \]

Conditional independence equivalent statements

✅ Conclusion:

\[ P(A \mid B, C) = P(A \mid C) \quad \Longleftrightarrow \quad P(A, B \mid C) = P(A \mid C) \cdot P(B \mid C) \]

as required.