Capstok — learn by doing

Why this matters

MLOps exists because machine learning systems break an assumption that software engineering is built on. In conventional software, behavior lives almost entirely in the code (plus its configuration and dependencies), so versioning the code effectively versions the system. In ML, behavior comes from code plus data plus a trained model plus the live distribution of inputs: change any one and the system behaves differently even though nothing in git moved. That extra dependency on data and a stochastic training process is why you can't just 'deploy and forget' a model, and it's a large part of why this discipline is treated separately from DevOps. Getting this distinction into your bones is what makes every later practice (versioning data, tracking experiments, monitoring drift) feel necessary rather than ceremonial.
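
To make "nothing in git moved" concrete, here is a minimal sketch that fingerprints a system from hashes of its inputs. The strings below are hypothetical stand-ins (a real setup would hash files or dataset snapshots), and `sha` and `fingerprint` are illustrative helper names, not part of any library.

```python
import hashlib
import json

def sha(text: str) -> str:
    """Short content hash of one input (stand-in for hashing a file)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]

def fingerprint(parts: dict) -> str:
    """Combine the per-input hashes into one ID for the whole system."""
    canonical = json.dumps(parts, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:12]

# Hypothetical inputs: the code is identical on both days, the data is not.
code = "def predict(x): return model(x)"
data_monday = "age,income\n34,52000\n"
data_tuesday = "age,income\n34,52000\n61,18000\n"  # data refresh, no commit

monday = {"code": sha(code), "data": sha(data_monday)}
tuesday = {"code": sha(code), "data": sha(data_tuesday)}

code_unchanged = monday["code"] == tuesday["code"]
system_unchanged = fingerprint(monday) == fingerprint(tuesday)

print("code unchanged:  ", code_unchanged)
print("system unchanged:", system_unchanged)
```

The code hash is the same on both days, yet the combined fingerprint differs, which is the situation a code-only version history cannot see.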

Demo

The demo lays the two lifecycles side by side as a checklist, so you can see exactly which steps software has that ML inherits, and which steps (data versioning, training, eval, drift monitoring) are net-new and have no DevOps equivalent.
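
One way to picture that checklist is the sketch below. The software step names are assumptions chosen for illustration; only the four net-new ML steps come from this lesson.

```python
# Assumed, illustrative software steps (not taken from the lesson).
software_steps = ["write code", "review", "test", "build", "deploy", "monitor uptime"]
# The four net-new steps named in the lesson.
ml_only_steps = ["data versioning", "training", "eval", "drift monitoring"]

# ML inherits every software step and adds its own.
ml_steps = software_steps + ml_only_steps

for step in ml_steps:
    in_software = "x" if step in software_steps else " "
    print(f"software [{in_software}]  ML [x]  {step}")

net_new = [s for s in ml_steps if s not in software_steps]
print("net-new in ML:", net_new)
```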

Try it yourself

List the inputs that determine your own ML/LLM app's behavior and circle which ones are NOT in git today.
Find one place where your system could change behavior with zero code commits (a data refresh, a model swap, a prompt edit) — that gap is what MLOps closes.
Run can_reproduce with each argument False in turn and note which missing artifact breaks reproducibility.
Write one sentence stating which of code/data/model/distribution your current setup version-controls and which it doesn't.
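
For the can_reproduce step, a loop like the following sketch saves typing four separate calls. It copies the demo's function so that it runs on its own; `arg_names` and `results` are names introduced here for illustration.

```python
def can_reproduce(have_code, have_data_version, have_seed, have_weights):
    # Same logic as the lesson's demo function, copied so this runs standalone.
    missing = [k for k, v in {
        "code": have_code, "data_version": have_data_version,
        "seed": have_seed, "weights": have_weights}.items() if not v]
    return "reproducible" if not missing else f"NOT reproducible, missing: {missing}"

arg_names = ["have_code", "have_data_version", "have_seed", "have_weights"]
results = {}
for name in arg_names:
    flags = {n: True for n in arg_names}
    flags[name] = False  # drop exactly one artifact per run
    results[name] = can_reproduce(**flags)
    print(f"{name}=False -> {results[name]}")
```

Each run drops a single artifact, so the printed message names exactly the one that was withheld.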

Prompt your AI

Use these three prompts in order. Each builds on the one before.

1. Basics & terminology

In one paragraph, explain how the machine learning lifecycle differs from the normal software development lifecycle, like I'm new to it.

2. Why it works (the mechanism)

Walk me through, step by step, why an ML system's behavior depends on data and training and not just code, and what that means for reproducibility.

3. Advanced — application & what's next

Given an ML system whose accuracy silently dropped with no code change, walk me through the candidate causes that a software-only mental model would miss, and why.

# What changes the BEHAVIOR of each kind of system?
software_inputs = {"code"}
ml_inputs = {"code", "training_data", "hyperparams", "random_seed",
             "trained_weights", "live_input_distribution"}

new_in_ml = ml_inputs - software_inputs
print("ML adds these behavior-determining inputs:", sorted(new_in_ml))

# A consequence: 'it worked yesterday' can break with NO code change,
# because training_data or live_input_distribution moved underneath you.
def can_reproduce(have_code, have_data_version, have_seed, have_weights):
    """Report whether a run can be reproduced, naming any missing artifacts."""
    artifacts = {
        "code": have_code, "data_version": have_data_version,
        "seed": have_seed, "weights": have_weights}
    missing = [name for name, present in artifacts.items() if not present]
    return "reproducible" if not missing else f"NOT reproducible, missing: {missing}"

print(can_reproduce(True, False, True, True))   # missing data_version -> not reproducible

Run: python3 main.py

The ML lifecycle vs. the software lifecycle