100 challenges — from reading a curl waterfall to finding the index that makes your p99 drop from 4s to 40ms.
A diagnostic course built around the question 'why is this API slow?' Every challenge gives you a clue, a measurement technique, or a broken system to fix. You'll build fluency in the vocabulary (TTFB, turnaround time, p99, error budget), the instruments (curl -w, Server-Timing headers, Chrome DevTools waterfall, EXPLAIN), and the systematic approach (network first, then DB, then code, then infrastructure). By Module 10 you will have a repeatable playbook for any slow endpoint.
Built by Lakshya Kumar
Paste this into any AI chat. Fill in the bracketed parts with your context — you'll get back a straight answer on whether this belongs on your plate.
We grant free access case-by-case: students, career-switchers, builders on a tight budget.
Complete all modules, then submit the required number of capstone projects. Each must earn a passing rating from an admin reviewer.
Pick any API (your own project, an open-source backend, or a public API). Run a full performance audit: baseline the p50/p95/p99 with your benchmark script, identify the single biggest bottleneck using the techniques from the course, fix it, and prove the improvement with a before/after benchmark. Deliver a written report that names every tool used, every suspect eliminated, and the measured result.
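As a starting point for the baseline step, a benchmark script could look roughly like the sketch below. It is not the course's script: the helper names (`percentile`, `summarize`, `benchmark`) are hypothetical, and it assumes the nearest-rank percentile definition and a caller-supplied zero-argument function that performs one request.

```python
import math
import time
from typing import Callable, Dict, List


def percentile(sorted_samples: List[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of
    all samples at or below it. Expects the input already sorted."""
    if not sorted_samples:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p * len(sorted_samples) / 100))
    return sorted_samples[rank - 1]


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Reduce raw latencies to the three numbers the audit report needs."""
    ordered = sorted(latencies_ms)
    return {
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
    }


def benchmark(call: Callable[[], object], runs: int = 200) -> Dict[str, float]:
    """Time `call` sequentially `runs` times and summarize in milliseconds.

    `call` is an assumption: any zero-argument function that performs
    one request against the endpoint under test.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000)
    return summarize(samples)
```

Running `benchmark` once before the fix and once after, with the same `runs`, gives the before/after pair for the report. A single sequential loop like this measures latency without concurrency, so it will likely understate tail latency under real load.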
I'm considering a "Debugging Slow APIs" course. It starts with measurement vocabulary (TTFB, p99, SLOs), works through the network layer, database bottlenecks, server-side profiling, caching, external dependencies, serverless cold starts, observability, and load testing, and finishes with a systematic debugging playbook.

Context about me:
1. My current role: [e.g. "backend dev", "full-stack engineer", "DevOps/SRE", "frontend dev who gets blamed for slow APIs"]
2. The slowest API problem I've personally dealt with: [e.g. "never debugged one", "a query that took 8s with no indexes", "a cold start on Lambda that ruined our checkout"]
3. What I'm hoping this changes: [e.g. "I can diagnose any slow API in under 30 minutes", "I stop guessing and start measuring", "I can write SLOs my team actually uses"]

Answer these:
- For my background, which module will give me the fastest ROI in the next month, and why?
- Name one concrete thing I'll be able to do after this course that I can't do today.
- Is there a faster path for someone who only cares about one layer (e.g. just DB, just network)?
- What will I NOT learn here that I might expect? (e.g. "you will not learn how to provision infrastructure", "you will not learn Kubernetes")
Find an API endpoint with N+1 queries (yours or from a sample app). Implement at least three solutions (DataLoader, prefetch joins, query batching) and benchmark each. Produce a report showing latency reduction at p50/p95/p99 and the cost trade-offs of each approach.
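To make the N+1 pattern and one of the listed fixes (query batching) concrete, here is a minimal sketch against an in-memory SQLite database. The schema, the `QueryCounter` wrapper, and the loader functions are all hypothetical and exist only for illustration; a real project would count queries through its ORM or database logs.

```python
import sqlite3
from collections import defaultdict


def make_db() -> sqlite3.Connection:
    """Build a throwaway database: 50 authors with 3 posts each."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        CREATE INDEX idx_posts_author ON posts(author_id);
    """)
    conn.executemany(
        "INSERT INTO authors VALUES (?, ?)",
        [(i, f"author{i}") for i in range(1, 51)],
    )
    conn.executemany(
        "INSERT INTO posts (author_id, title) VALUES (?, ?)",
        [(a, f"post{a}-{n}") for a in range(1, 51) for n in range(3)],
    )
    return conn


class QueryCounter:
    """Counts every query issued through it, standing in for a DB round trip."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.count = 0

    def query(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


def load_n_plus_one(db: QueryCounter) -> dict:
    """One query for the authors, then one more query per author."""
    authors = db.query("SELECT id, name FROM authors ORDER BY id")
    return {
        name: [title for (title,) in db.query(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id", (aid,))]
        for aid, name in authors
    }


def load_batched(db: QueryCounter) -> dict:
    """Two queries total: authors, then all of their posts in one IN (...) query."""
    authors = db.query("SELECT id, name FROM authors ORDER BY id")
    ids = [aid for aid, _ in authors]
    if not ids:
        return {}
    # Large ID lists may need chunking to stay under the bound-parameter limit.
    placeholders = ",".join("?" * len(ids))
    rows = db.query(
        f"SELECT author_id, title FROM posts "
        f"WHERE author_id IN ({placeholders}) ORDER BY id",
        ids,
    )
    by_author = defaultdict(list)
    for aid, title in rows:
        by_author[aid].append(title)
    return {name: by_author[aid] for aid, name in authors}
```

With this data set the first loader issues 51 queries and the second issues 2, while both return the same mapping. In-process SQLite has almost no per-query cost, so the latency gap only becomes visible once each query crosses a network.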
Take an API with p99 latency more than 5x its p50. Identify the cause via profiling (GC pauses, lock contention, slow downstream, large response). Implement at least three optimizations, measure the result, and prove the new p99 is within 2x of p50.
Design and implement a 3-tier cache (in-process, Redis, CDN) for a high-traffic endpoint. Include cache invalidation strategy, stampede protection (singleflight), and a measurement showing cache hit rate, origin load reduction, and stale-while-revalidate behavior.
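The stampede-protection piece can be sketched on its own. "Singleflight" is best known from Go; the class below is a hypothetical Python approximation using threads, not a port of any particular library. It assumes one process, and the `coalesced` counter is an added convenience for measuring how many origin calls were avoided.

```python
import threading
from typing import Any, Callable, Dict


class _Call:
    """State shared between the leader and the followers for one key."""

    def __init__(self) -> None:
        self.done = threading.Event()
        self.value: Any = None
        self.error: Any = None


class SingleFlight:
    """Ensures only one in-flight load per key; concurrent callers share its result."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._calls: Dict[str, _Call] = {}
        self.coalesced = 0  # number of callers that piggybacked on another load

    def do(self, key: str, fn: Callable[[], Any]) -> Any:
        with self._lock:
            call = self._calls.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._calls[key] = call
            else:
                self.coalesced += 1

        if leader:
            try:
                call.value = fn()  # the only trip to the origin for this key
            except Exception as exc:
                call.error = exc
            finally:
                # Remove the key first so later callers start a fresh load.
                with self._lock:
                    del self._calls[key]
                call.done.set()
        else:
            call.done.wait()  # follower: block until the leader finishes

        if call.error is not None:
            raise call.error
        return call.value
```

On a cache miss, the handler would call something like `flight.do(cache_key, load_from_origin)` and then populate the cache with the result. This only coalesces requests inside one process; coalescing across several app servers would need a shared lock, for example in Redis.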
Build a rate limiter for an API: token-bucket per user + global, distributed via Redis with Lua script for atomicity. Include burst handling, fair-share across tiers, an admin-overridable allowlist, and a 10k-rps load test demonstrating correct throttling without false positives.
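The core token-bucket logic can be prototyped in a single process before moving it into Redis. The sketch below is only that prototype: it is not distributed, not thread-safe, and does not cover tiers or allowlists. The class names and the injectable `clock` parameter are assumptions made for illustration and testing.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity` (the burst size)."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start full so an initial burst is allowed
        self._last = clock()

    def available(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then report whether `cost` tokens exist."""
        now = self._clock()
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.rate)
        self._last = now
        return self._tokens >= cost

    def consume(self, cost: float = 1.0) -> None:
        self._tokens -= cost


class TwoLevelLimiter:
    """Admits a request only if both the per-user and the global bucket have a token."""

    def __init__(self, user_rate: float, user_burst: float,
                 global_rate: float, global_burst: float,
                 clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._user_rate = user_rate
        self._user_burst = user_burst
        self._global = TokenBucket(global_rate, global_burst, clock)
        self._users: Dict[str, TokenBucket] = {}

    def allow(self, user_id: str) -> bool:
        bucket = self._users.get(user_id)
        if bucket is None:
            bucket = TokenBucket(self._user_rate, self._user_burst, self._clock)
            self._users[user_id] = bucket
        # Check both buckets before consuming from either, so a request
        # rejected by the global limit does not cost the user a token.
        if bucket.available() and self._global.available():
            bucket.consume()
            self._global.consume()
            return True
        return False
```

The check-then-consume sequence in `allow` is the part that has to be atomic once several servers share the buckets, which is the role the brief assigns to the Lua script in Redis.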
Free online. The network-layer module draws heavily from chapters 2, 4, and 9.