Capstok — learn by doing

Why this matters

Before the deep dives, it helps to hold the whole map in your head: the levers you can pull and the order to reach for them. The levers are select (what to include), order (where to place it), compress (shrink what you keep), cache (reuse what's stable), and measure (know if it worked). Most teams reach for 'bigger model' or 'better prompt' first; experienced context engineers reach for select and order first, because they're cheap and high-impact. This mental model is the spine of the rest of the course.

Demo

The toolkit, encoded as a priority list. When answer quality is poor, walk the levers top-down: is the right content selected? Is it ordered well? Only then consider compression and caching, check that you have an eval to measure changes against, and finally consider model or prompt changes. The function returns the next lever to try.

Try it yourself

Run the diagnosis on your worst-performing LLM feature and see which lever it points to.
Notice that 'bigger model' is dead last. Resist the urge to jump there — it's the most expensive and least diagnostic.
For each lever, write the name of the module in this course that teaches it (select→M3/M4, order→M3/M9, compress→M7, cache→M8, measure→M10).
Add a sixth diagnostic of your own (e.g. 'context contains contradictions') and decide where it slots in the priority order.
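
One way to approach the last exercise is to turn the priority order into a table, so that adding a diagnostic means adding a row. The sketch below is only an illustration: the name next_lever_from_table, the key context_contains_contradictions, its placement right after the selection check, and the choice to treat missing keys as healthy are all assumptions, not part of the lesson's demo.

```python
# Hypothetical table-driven variant of the demo function.
# Each row: (diagnosis key, value that signals a problem, lever to try).
# Rows are checked top-down, so list order is the priority order.
CHECKS = [
    ("right_content_present", False, "SELECT"),
    # Assumed sixth diagnostic; slotting it here is a judgment call.
    ("context_contains_contradictions", True, "SELECT"),
    ("evidence_buried_in_middle", True, "ORDER"),
    ("over_budget", True, "COMPRESS"),
    ("latency_or_cost_too_high", True, "CACHE"),
    ("have_eval", False, "MEASURE"),
]


def next_lever_from_table(diagnosis: dict) -> str:
    """Return the first lever whose check reports a problem."""
    for key, problem_value, lever in CHECKS:
        # Assumption: a key missing from the diagnosis counts as healthy.
        if diagnosis.get(key, not problem_value) == problem_value:
            return lever
    return "MODEL_OR_PROMPT"


print(next_lever_from_table({
    "right_content_present": True,
    "context_contains_contradictions": True,
    "have_eval": True,
}))
```

Moving a row up or down changes the priority without touching the function, which makes it easy to experiment with where your own diagnostic belongs.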

Prompt your AI

Use these three in order. Each builds on the one before.

1. Basics & terminology

What are the five core levers of context engineering (select, order, compress, cache, measure) and what does each one do?

2. Why it works (the mechanism)

Explain why 'select' and 'order' should usually be tried before 'bigger model' or 'better prompt' when answer quality is poor.

3. Advanced — application & what's next

Turn the five levers into a production runbook: given a quality complaint, what do I check and in what order, and what evidence tells me to move from one lever to the next?

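def next_lever(diagnosis: dict) -> str:
    """Return the next lever to try, checking the diagnoses in priority order.

    Expects a boolean under each of the five keys tested below;
    a missing key raises KeyError.
    """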
    if not diagnosis["right_content_present"]:
        return "SELECT: fix retrieval/inclusion — the evidence isn't even in the window"
    if diagnosis["evidence_buried_in_middle"]:
        return "ORDER: move key evidence to the start/end of context"
    if diagnosis["over_budget"]:
        return "COMPRESS: summarize history / prune low-value blocks"
    if diagnosis["latency_or_cost_too_high"]:
        return "CACHE: cache the stable prefix (system + tools + corpus)"
    if not diagnosis["have_eval"]:
        return "MEASURE: you can't tune what you can't see — build an eval first"
    return "Only now consider a bigger model or prompt rewrite"

print(next_lever({"right_content_present": False, "evidence_buried_in_middle": False,
    "over_budget": False, "latency_or_cost_too_high": False, "have_eval": False}))

Run: python3 main.py
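
The demo returns only the first lever that needs attention. For a runbook it can help to see every flagged lever at once, still in priority order. The following is a minimal sketch under that assumption; lever_plan and the short lever labels are hypothetical names, and the example diagnosis is made up for illustration.

```python
def lever_plan(diagnosis: dict) -> list[str]:
    """Return every lever that needs attention, in the lesson's priority order."""
    plan = []
    if not diagnosis["right_content_present"]:
        plan.append("SELECT")
    if diagnosis["evidence_buried_in_middle"]:
        plan.append("ORDER")
    if diagnosis["over_budget"]:
        plan.append("COMPRESS")
    if diagnosis["latency_or_cost_too_high"]:
        plan.append("CACHE")
    if not diagnosis["have_eval"]:
        plan.append("MEASURE")
    # Only when no lever is flagged does a model or prompt change come up.
    return plan or ["MODEL_OR_PROMPT"]


# Made-up diagnosis: content is present but buried, cost is high, no eval yet.
example = {
    "right_content_present": True,
    "evidence_buried_in_middle": True,
    "over_budget": False,
    "latency_or_cost_too_high": True,
    "have_eval": False,
}
print(lever_plan(example))  # ['ORDER', 'CACHE', 'MEASURE']
```

Work the list from the front and re-run the diagnosis after each fix, since fixing an earlier lever can change the later answers.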

Your context engineering toolkit and mental model