Every server starts fast and ends slow. The first bottleneck on a real service is almost never CPU; it's a synchronous call to something slow. The database. A downstream HTTP call. A file read. A synchronous log write. If your handler waits 50ms on the DB, each connection can complete at most 1000ms / 50ms = 20 req/s on that endpoint. The art of scaling is finding those synchronous calls and deciding what to do with each one: parallelize, cache, make it async, or eliminate it.
The first bottleneck on any real service is almost never CPU — it's a synchronous call to something slow: a database, a downstream HTTP request, a file read. When your handler blocks for 50ms waiting on the DB, your server's per-connection throughput becomes 20 req/s no matter how many CPUs you have. Profiling that one path shows exactly where time is spent and why the fix is async, not a bigger machine.
Check the X-Timing-MS response header. Which step is dominating the time?
Load test with wrk -c100 -d10s http://localhost:8080/users/42. Watch p99 latency.
Add an artificial delay (time.Sleep / setTimeout) inside the handler. Load test again. Did the result match your prediction?

Use these three prompts in order. Each builds on the one before.
Explain the phrase 'the first bottleneck is always I/O, not CPU' with a concrete example. Why does a 50ms database call dominate a handler that also does 500µs of JSON parsing?
Walk me through what happens to your process when a handler makes a synchronous DB call: syscall, waiting, return. In an event-loop runtime (Node/Python asyncio), why does a blocking call freeze everything, while an async call doesn't?
You have a handler that does A (10ms DB), B (20ms HTTP), C (50ms DB). All three are independent. Before any refactor, what's the handler's minimum possible latency and what's its actual latency? Walk me through the three techniques to bring the actual closer to the minimum: parallelization, pipelining, caching.
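package main

import (
	"bytes"
	"database/sql"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"time"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	db, err := sql.Open("sqlite3", ":memory:")
	if err != nil {
		log.Fatal(err)
	}
	// Each new connection to ":memory:" gets its own empty database,
	// so pin the pool to one connection to keep the table visible under load.
	db.SetMaxOpenConns(1)
	db.Exec(`CREATE TABLE u (id INTEGER, name TEXT)`)
	db.Exec(`INSERT INTO u VALUES (42, 'Ada')`)

	http.HandleFunc("GET /users/{id}", func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		// 1. parse (fast: microseconds)
		id := r.PathValue("id")
		// 2. DB query (slow: milliseconds to tens of ms)
		var name string
		if err := db.QueryRow(`SELECT name FROM u WHERE id = ?`, id).Scan(&name); err != nil {
			http.Error(w, "user not found", http.StatusNotFound)
			return
		}
		// 3. render (fast: microseconds) into a buffer, because headers
		// must be set before the first byte of the body is written
		var body bytes.Buffer
		json.NewEncoder(&body).Encode(map[string]any{"id": id, "name": name})
		w.Header().Set("X-Timing-MS", formatMs(time.Since(start)))
		w.Write(body.Bytes())
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}

// formatMs renders a duration as whole milliseconds.
func formatMs(d time.Duration) string {
	return fmt.Sprintf("%d", d.Milliseconds())
}

go run main.go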