100 challenges, from race conditions on your laptop to consensus across continents.
Threads, locks, channels, async, event loops — then we step off one machine. CAP, replication, consistency, consensus, fault tolerance. Every runnable demo lets you pick your language (Go, Python, Rust, or Node.js) from the code tabs, with compile-and-run instructions included.
Built by Lakshya Kumar
Paste this into any AI chat. Fill in the bracketed parts with your context — you'll get back a straight answer on whether this belongs on your plate.
I'm considering a "Concurrency & Distributed Systems" course. It covers threads, locks, channels, async/await, event loops, then moves off one machine: replication, consistency models (CAP, linearizable, eventual), consensus (Raft), fault tolerance, and chaos testing. Demos are available in Go, Python, Rust, and Node.js via code tabs.

Context about me:
1. My current role/focus: [e.g. "backend dev at a 50-person startup", "self-taught, building side projects", "senior frontend engineer who wants to cross over"]
2. The hardest concurrency bug I've hit so far was: [describe it, or say "I haven't really hit one"]
3. What I'm hoping this course changes about me: [e.g. "stop being scared of multi-threaded code", "understand my team's Kafka setup", "get promoted to senior"]

Answer these:
- For my background, which 2 modules would have the highest immediate payoff in the next 3 months, and why?
- Name a concrete failure I've probably already caused (or will soon) that this course would prevent.
- Is a 30-hour investment worth it for me, or should I learn something more specific first (e.g. SQL perf, networking)? Give your honest pick and reason.
- What should I explicitly NOT expect to get out of this course (e.g. Kubernetes ops, specific framework mastery)?
We grant free access case-by-case — students, career-switchers, builders on a tight budget. Sign in to send us a note.
Complete all modules, then submit the required number of capstone projects. Each must earn a passing rating from an admin reviewer.
Build a small service (e.g. a URL shortener or counter) that survives a node failure. Use replication, a consensus library (etcd/Raft), or a manual primary-backup setup. Prove it works with a chaos test.
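To make the primary-backup option concrete, here is a minimal single-process sketch in Python. It is only a toy under stated assumptions: the names `Replica`, `CounterService`, and `chaos_test` are hypothetical, replicas are plain objects rather than networked processes, and failure is simulated by flipping an `alive` flag. It shows the core idea that an increment is copied to the backup before it is acknowledged, so the count survives losing the primary.

```python
class Replica:
    """One in-memory copy of the counter (hypothetical stand-in for a node)."""

    def __init__(self, name):
        self.name = name
        self.count = 0
        self.alive = True


class CounterService:
    """Synchronous primary-backup: replicate to the backup, then acknowledge."""

    def __init__(self):
        self.primary = Replica("a")
        self.backup = Replica("b")

    def _fail_over(self):
        # Promote the backup; the dead primary becomes the (dead) backup.
        if not self.backup.alive:
            raise RuntimeError("no live replica left")
        self.primary, self.backup = self.backup, self.primary

    def increment(self):
        if not self.primary.alive:
            self._fail_over()
        new_value = self.primary.count + 1
        if self.backup.alive:
            self.backup.count = new_value  # replicate before acknowledging
        self.primary.count = new_value
        return new_value


def chaos_test(n_before=5, n_after=3):
    """Kill the primary mid-run and check that no acknowledged increment is lost."""
    svc = CounterService()
    for _ in range(n_before):
        svc.increment()
    svc.primary.alive = False  # injected failure
    for _ in range(n_after):
        svc.increment()
    return svc.primary.name, svc.primary.count
```

A real submission would replace the `alive` flag with killed processes or dropped connections, and would need to decide who is allowed to promote the backup, which is where a consensus library earns its keep.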
Build a multi-stage producer-consumer pipeline (3 stages, each with a bounded queue) in your language of choice. Include backpressure, graceful shutdown, and a load test showing throughput is bounded by the slowest stage. Profile and identify the bottleneck.
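As a starting point, the shape of such a pipeline can be sketched with Python's standard `queue` and `threading` modules. The stage functions, the `SENTINEL` marker, and the `run_pipeline` helper are illustrative assumptions, not part of the course material. Backpressure comes from `Queue(maxsize=...)`: a `put` on a full queue blocks, so a fast stage is forced to wait for the slow one behind it.

```python
import queue
import threading

SENTINEL = object()  # end-of-stream marker, passed down stage by stage


def stage(fn, inbox, emit):
    """Read from inbox, apply fn, hand the result to emit until SENTINEL arrives."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            emit(SENTINEL)  # propagate shutdown downstream, then exit
            return
        emit(fn(item))  # if emit is a bounded queue's put, this blocks when full


def run_pipeline(items, maxsize=4):
    """Three stages, each fed by its own bounded queue."""
    q1 = queue.Queue(maxsize=maxsize)
    q2 = queue.Queue(maxsize=maxsize)
    q3 = queue.Queue(maxsize=maxsize)
    results = []

    def collect(value):
        if value is not SENTINEL:
            results.append(value)

    threads = [
        threading.Thread(target=stage, args=(lambda x: x + 1, q1, q2.put)),
        threading.Thread(target=stage, args=(lambda x: x * 2, q2, q3.put)),
        threading.Thread(target=stage, args=(str, q3, collect)),
    ]
    for t in threads:
        t.start()
    for item in items:
        q1.put(item)  # the producer blocks here when stage 1 falls behind
    q1.put(SENTINEL)  # graceful shutdown: drain everything, then stop
    for t in threads:
        t.join()
    return results
```

With one thread per stage and FIFO queues, output order matches input order. The load test and profiling the brief asks for are not covered here; adding a `time.sleep` to one stage function is a simple way to create a deliberate bottleneck to measure.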
Implement a distributed lock using Redis (or etcd) with fencing tokens. Demonstrate the classic 'GC pause' failure mode where a lock holder pauses, the lock expires, and another holder acquires it. Show that fencing tokens prevent stale-holder writes from corrupting state.
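The failure mode and the fix can be illustrated without a real Redis or etcd instance. The sketch below is an in-memory simulation under stated assumptions: `LockService`, `FencedStore`, and `demo` are hypothetical names, and time is passed in explicitly as `now` so the "pause" is deterministic. The lock service hands out a strictly increasing token with each lease, and the store refuses any write carrying a token lower than one it has already seen.

```python
class LockService:
    """Toy stand-in for Redis/etcd: leases expire and tokens only increase."""

    def __init__(self):
        self._token = 0
        self._holder = None
        self._expires_at = 0.0

    def acquire(self, client, now, ttl):
        if self._holder is not None and now < self._expires_at:
            return None  # lease is still held
        self._token += 1
        self._holder = client
        self._expires_at = now + ttl
        return self._token  # the fencing token


class FencedStore:
    """Storage that rejects writes from stale lock holders."""

    def __init__(self):
        self.value = None
        self.highest_token = 0

    def write(self, token, value):
        if token < self.highest_token:
            return False  # a newer holder has already written
        self.highest_token = token
        self.value = value
        return True


def demo():
    lock, store = LockService(), FencedStore()
    token_a = lock.acquire("A", now=0.0, ttl=10.0)
    # A stalls (for example in a long GC pause) past its lease expiry.
    token_b = lock.acquire("B", now=15.0, ttl=10.0)
    b_ok = store.write(token_b, "from B")
    # A wakes up still believing it holds the lock.
    a_ok = store.write(token_a, "from A")
    return token_a, token_b, b_ok, a_ok, store.value
```

Here `demo()` returns `(1, 2, True, False, "from B")`: the stale write from A is rejected. Note that the check lives in the storage layer, not in the lock client, since a paused client cannot be trusted to notice that its own lease has expired.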
Implement Raft leader election from scratch in your language of choice. Run a 5-node cluster, partition the network in 3 ways (split brain, isolated leader, slow node), and verify that in every case the cluster converges to a single leader within a few election timeouts (a split vote can force a retry).
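The heart of leader election is the vote-granting rule, which can be sketched in a few lines of Python. This is a simplified illustration, not a full Raft node: the `Node` class and `run_election` helper are hypothetical, calls are direct method invocations instead of RPCs, and timers, heartbeats, and stepping down on seeing a higher term in a reply are all omitted. It shows the three checks a voter makes: reject older terms, grant at most one vote per term, and only vote for a candidate whose log is at least as up to date.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    node_id: int
    current_term: int = 0
    voted_for: Optional[int] = None
    last_log_index: int = 0
    last_log_term: int = 0

    def handle_request_vote(self, term, candidate_id, last_log_index, last_log_term):
        """Return True if this node grants its vote to the candidate."""
        if term < self.current_term:
            return False  # stale candidate
        if term > self.current_term:
            self.current_term = term  # newer term: forget any earlier vote
            self.voted_for = None
        # Candidate's log must be at least as up to date: compare term, then index.
        log_ok = (last_log_term, last_log_index) >= (
            self.last_log_term,
            self.last_log_index,
        )
        if self.voted_for in (None, candidate_id) and log_ok:
            self.voted_for = candidate_id
            return True
        return False


def run_election(candidate, reachable_peers, cluster_size):
    """One election round: the candidate wins only with a strict majority."""
    candidate.current_term += 1
    candidate.voted_for = candidate.node_id
    votes = 1  # votes for itself
    for peer in reachable_peers:
        if peer.handle_request_vote(
            candidate.current_term,
            candidate.node_id,
            candidate.last_log_index,
            candidate.last_log_term,
        ):
            votes += 1
    return votes > cluster_size // 2
```

Passing only a subset of nodes as `reachable_peers` is a crude way to model a partition: in a 5-node cluster, a candidate that can reach two peers can win, while one that can reach a single peer cannot.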
Implement a saga for a multi-step distributed transaction (order: reserve inventory, charge payment, ship). Include compensating transactions for each failure point. Inject a failure at each step and verify the saga always converges to a consistent state.
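One way to picture the control flow is an orchestrated saga that records each completed step and, on failure, runs the compensations in reverse order. The sketch below is a minimal in-memory version under stated assumptions: `run_saga`, `make_order_saga`, `StepFailed`, and the `fail_at` injection parameter are hypothetical names, and the "services" are just updates to a dictionary.

```python
class StepFailed(Exception):
    """Raised when a saga step cannot complete."""


def run_saga(steps, fail_at=None):
    """Run (name, action, compensation) steps in order.

    On failure, undo every completed step in reverse. Returns (ok, log).
    fail_at names a step to fail on purpose, for fault-injection tests.
    """
    log = []
    completed = []
    for name, action, compensate in steps:
        try:
            if name == fail_at:
                raise StepFailed(name)  # injected failure
            action()
        except StepFailed:
            log.append(f"fail:{name}")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"undo:{done_name}")
            return False, log
        log.append(f"do:{name}")
        completed.append((name, compensate))
    return True, log


def make_order_saga(state):
    """Order saga from the brief: reserve inventory, charge payment, ship."""
    return [
        ("reserve_inventory",
         lambda: state.update(reserved=True),
         lambda: state.update(reserved=False)),
        ("charge_payment",
         lambda: state.update(charged=True),
         lambda: state.update(charged=False)),
        ("ship",
         lambda: state.update(shipped=True),
         lambda: state.update(shipped=False)),
    ]
```

Looping `fail_at` over every step name and checking that `state` ends with all flags `False` covers the "inject a failure at each step" requirement for this toy version. In a distributed setting the compensations themselves can fail or be retried, so they would need to be idempotent, which this sketch does not address.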
Accessible paper summaries. Skim after Module 6 to see where the field came from.