The Probability Set Function

P: B -> R

B is a sigma-algebra on C.

Properties:
P(A) >= 0 \forall A in B
P(C) = 1
\forall A1, A2, A3, ... in B, if A_i \cap A_j = \empty for all i != j,
P(infinite union of A1, A2, ...) = sum over all j of P(A_j)
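
The three properties can be checked by brute force on a small finite
example. The sketch below assumes a fair six-sided die: C = {1, ..., 6},
B = all subsets of C, and P uniform. These choices are illustrative
assumptions, not part of the notes. On a finite space, countable
additivity reduces to additivity over pairs of disjoint events, which is
what the code checks.

```python
from fractions import Fraction
from itertools import chain, combinations

# Assumed example: sample space of a fair six-sided die.
C = frozenset(range(1, 7))

def P(A):
    """Uniform probability set function on C."""
    return Fraction(len(A), len(C))

def power_set(s):
    """All subsets of s (the largest sigma-algebra on a finite set)."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(sub) for sub in subsets]

B = power_set(C)

# Property 1: P(A) >= 0 for every event A in B.
nonnegative = all(P(A) >= 0 for A in B)
# Property 2: P(C) = 1.
normalized = P(C) == 1
# Property 3 (finite form): P(A1 u A2) = P(A1) + P(A2) for disjoint A1, A2.
additive = all(P(A1 | A2) == P(A1) + P(A2)
               for A1 in B for A2 in B if not (A1 & A2))

print(nonnegative, normalized, additive)
```

Fractions are used instead of floats so that the equalities are exact.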

Useful inequalities:
Boole's inequality (th 1.3.7) P(union of A1, A2, ...) <= P(A1) + P(A2) + ...
(Derives from the inclusion-exclusion formula)
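
A small numeric illustration of Boole's inequality, which bounds the
probability of a union by the sum of the individual probabilities. The
die and the events A1, A2, A3 are assumed for illustration only; the
events overlap on purpose so that the bound is strict.

```python
from fractions import Fraction

# Assumed example: fair six-sided die.
C = frozenset(range(1, 7))

def P(A):
    """Uniform probability set function on C."""
    return Fraction(len(A), len(C))

# Overlapping events (not disjoint).
A1 = frozenset({1, 2, 3})
A2 = frozenset({2, 3, 4})
A3 = frozenset({4, 5})

lhs = P(A1 | A2 | A3)        # probability of the union
rhs = P(A1) + P(A2) + P(A3)  # sum of the probabilities

print(lhs, "<=", rhs)
```

The two sides are equal only when the events are pairwise disjoint (up to
overlaps of probability zero).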

    Conditional Probability

Let A, B be events in \B (the sigma-algebra from above, not the event B)
Assume P(B) > 0 [because it wouldn't make sense to condition on an
impossible event]

P(A | B) = P(A \cap B) / P(B)

P(* | B) : \B -> R [new notation: * stands for the event argument]

P(* | B) has the same properties as P because it is itself a
probability set function:

P(A | B) >= 0
P(C | B) = 1 [and P(B | B) = 1 ]
P(* | B) is sigma-additive
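
A minimal sketch of the definition P(A | B) = P(A \cap B) / P(B) on a
finite space, checking that P(* | B) behaves like a probability set
function. The die, the events, and the names P_given and B_event are
assumptions made for the example.

```python
from fractions import Fraction

# Assumed example: fair six-sided die.
C = frozenset(range(1, 7))

def P(A):
    """Uniform probability set function on C."""
    return Fraction(len(A), len(C))

def P_given(A, B):
    """P(A | B) = P(A n B) / P(B); only defined when P(B) > 0."""
    if P(B) == 0:
        raise ValueError("cannot condition on an event of probability 0")
    return P(A & B) / P(B)

B_event = frozenset({2, 4, 6})  # the roll is even
A = frozenset({4, 5, 6})        # the roll is at least 4

p = P_given(A, B_event)

# Properties of P(* | B):
total_mass = P_given(C, B_event)        # P(C | B)
self_mass = P_given(B_event, B_event)   # P(B | B)
# Additivity on two disjoint events {2} and {4}.
left = P_given(frozenset({2, 4}), B_event)
right = P_given(frozenset({2}), B_event) + P_given(frozenset({4}), B_event)

print(p, total_mass, self_mass, left == right)
```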

Sometimes it's simpler to start from P(A | B) and P(B) and compute
P(A \cap B) = P(A | B) * P(B), like in a Markov chain.
P(A \cap B_1 \cap B_2) = P(A | B_1 \cap B_2)P(B_1 | B_2)P(B_2).
    Extends to any finite number of events by induction.
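
The three-event multiplication rule can be checked by enumeration. The
sketch below assumes an urn with 3 red and 2 blue balls and three draws
without replacement; B_2, B_1, A are "first / second / third ball is
red". The urn and all names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# Assumed example: urn with 3 red and 2 blue balls, 3 ordered draws
# without replacement. Every ordered draw is equally likely.
balls = ["r1", "r2", "r3", "b1", "b2"]
C = frozenset(permutations(balls, 3))

def P(A):
    """Uniform probability set function on C."""
    return Fraction(len(A), len(C))

def P_given(A, B):
    """P(A | B) = P(A n B) / P(B), assuming P(B) > 0."""
    return P(A & B) / P(B)

def red_at(k):
    """Event: the ball drawn at position k (0-based) is red."""
    return frozenset(w for w in C if w[k].startswith("r"))

B2, B1, A = red_at(0), red_at(1), red_at(2)

# Left side: direct count of the outcomes where all three draws are red.
direct = P(A & B1 & B2)
# Right side: P(A | B1 n B2) * P(B1 | B2) * P(B2).
chained = P_given(A, B1 & B2) * P_given(B1, B2) * P(B2)

print(direct, chained)
```

The chained form matches the usual hand calculation (3/5)(2/4)(1/3).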

The law of total probability.

Consider B_1, B_2, ... in B such that B_i \cap B_j = \empty for all
i != j and the countable union of the B_i = C [the B_i partition C].

If P(B_i) > 0 for all i, then for any A in B,
P(A) = \sum_{i=1}^\infty P(A | B_i) * P(B_i)

    Proof

For any i >= 1,
P(A | B_i) * P(B_i) = P(A \cap B_i) [basic property of conditionals]
A = A \cap C = A \cap (countable union of B_i) = (countable union of A
\cap B_i).
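The A \cap B_i are pairwise disjoint because the B_i are, so by
countable additivity:
\to P(A) = P(countable union of A \cap B_i) = (countable sum of P(A \cap B_i))
\to P(A) = (countable sum of P(A | B_i)*P(B_i))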
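
A finite instance of the law of total probability, as a sketch. The die,
the three-set partition, and the event A are assumptions chosen so the
numbers are easy to verify by hand.

```python
from fractions import Fraction

# Assumed example: fair six-sided die.
C = frozenset(range(1, 7))

def P(A):
    """Uniform probability set function on C."""
    return Fraction(len(A), len(C))

def P_given(A, B):
    """P(A | B) = P(A n B) / P(B), assuming P(B) > 0."""
    return P(A & B) / P(B)

# Pairwise disjoint B_i whose union is C, each with P(B_i) > 0.
partition = [frozenset({1, 2}), frozenset({3, 4, 5}), frozenset({6})]
A = frozenset({2, 3, 6})

# Sum over i of P(A | B_i) * P(B_i).
total = sum(P_given(A, B_i) * P(B_i) for B_i in partition)

print(total, P(A))
```

Each term P(A | B_i) * P(B_i) equals P(A \cap B_i), which is 1/6 for every
i in this example.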

    Bayes' Theorem
If P(A) > 0,
P(B_i | A) = P(A | B_i) * P(B_i) / (sum over all B_j P(A | B_j)*P(B_j))

Proof: the numerator is P(A \cap B_i) by the definition of conditional
probability, and the denominator is P(A) by the law of total
probability.
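
Bayes' theorem as a small function, given the priors P(B_j) and the
likelihoods P(A | B_j) for a finite partition. The function name
bayes_posterior and the numbers are hypothetical, chosen only to
illustrate the formula.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods):
    """Return [P(B_j | A)] from priors [P(B_j)] and likelihoods [P(A | B_j)].

    Assumes the B_j partition the sample space, so the priors sum to 1.
    """
    # Numerators: P(A | B_j) * P(B_j) = P(A n B_j).
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    # Denominator: P(A), by the law of total probability.
    p_a = sum(joint)
    if p_a == 0:
        raise ValueError("P(A) must be > 0")
    return [j / p_a for j in joint]

# Assumed numbers: two hypotheses B_1, B_2.
priors = [Fraction(1, 3), Fraction(2, 3)]        # P(B_1), P(B_2)
likelihoods = [Fraction(3, 4), Fraction(1, 4)]   # P(A | B_1), P(A | B_2)

posterior = bayes_posterior(priors, likelihoods)
print(posterior)
```

Observing A moves the weight toward B_1, even though B_2 was more likely
a priori, because A is three times as likely under B_1.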