Open this lesson in your favourite AI. It'll walk you through the why, explain the demo, and quiz you on the try-it list.
The birthday paradox is the canonical 'my intuition is wrong' result and the gateway to a class of real engineering hazards: hash collisions, UUID truncation, randomly-allocated IDs, cryptographic salts, load-balancer keying. With just 23 people, the probability that two share a birthday is over 50% — not 23/365 as your gut says. Builders who do not internalise this ship 32-bit IDs into systems that will see millions of records and act surprised when collisions appear. The math is two lines. Let it stick.
With people and equally likely birthdays, the probability of at least one collision is one minus the probability that all are distinct. Distinct birthdays form a falling factorial; the closed form has a beautifully fast crossover.
Use these three in order. Each builds on the one before.
In one paragraph, explain why the birthday paradox is counterintuitive. Why does naive thinking compare $n$ against 365, when the actual relevant quantity is the number of *pairs*, $\binom{n}{2}$?
Walk me through deriving the exact formula $P(\text{collision}) = 1 - \prod_{k=0}^{n-1}(1 - k/d)$ from the multiplication rule and complementary counting. Then derive the approximation $P \approx 1 - e^{-n(n-1)/(2d)}$ using $1 - x \approx e^{-x}$.
I am designing an ID scheme for a system that will hold $10^9$ records and I want collision probability under $10^{-9}$. How many bits of ID do I need? Walk through the calculation using the birthday bound and explain why it is much larger than $\log_2 10^9 \approx 30$ bits.
// main.go
package main
import (
"fmt"
"math/rand"
)
func birthdayCollision(n, d int) float64 {
p := 1.0
for k := 0; k < n; k++ {
p *= 1.0 - float64(k)/float64(d)
}
return 1 - p
}
func main() {
for _, n := range []int{10, 23, 30, 50, 70} {
fmt.Printf("n=%3d P(collision) = %.4f\n", n, birthdayCollision(n, 365))
}
// Monte Carlo verification at n = 23
rng := rand.New(rand.NewSource(0))
N := 200_000
hits := 0
for i := 0; i < N; i++ {
seen := make(map[int]bool)
collision := false
for j := 0; j < 23; j++ {
b := rng.Intn(365) + 1
if seen[b] {
collision = true
break
}
seen[b] = true
}
if collision {
hits++
}
}
fmt.Printf("Monte Carlo P(collision @ n=23) = %.4f\n", float64(hits)/float64(N))
}
go run main.go