Open this lesson in your favourite AI. It'll walk you through the why, explain the demo, and quiz you on the try-it list.
The most expensive mistake in this whole field is fine-tuning when you didn't need to. Fine-tuning costs you a dataset, a training run, an eval harness, and a deployment — weeks of work — and people reach for it reflexively when a better prompt or a retrieval step would have solved the problem in an afternoon. The three tools solve different problems: prompting changes instructions, RAG injects fresh or private knowledge at query time, and fine-tuning bakes behavior into the weights. Knowing which lever to pull before you start is the single highest-leverage skill in this course, because it decides whether you spend an afternoon or a month.
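To make that distinction concrete before turning to the demo, here is a minimal sketch of where prompting and RAG each intervene. Everything in it is an illustrative assumption: `build_prompt`, `retrieve`, the tiny corpus, and the word-overlap scoring are hypothetical stand-ins (a real RAG system would typically use embeddings and a vector store). Fine-tuning does not appear because it changes the model's weights, not the text you send it.

```python
import re

def tokenize(text):
    # Lowercase and split on non-alphanumerics so punctuation does not block matches.
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, corpus, k=1):
    # Toy retriever (assumption): rank documents by how many words they share with the question.
    q = tokenize(question)
    ranked = sorted(corpus, key=lambda doc: len(q & tokenize(doc)), reverse=True)
    return ranked[:k]

def build_prompt(question, instructions, retrieved_docs=()):
    # Prompting changes `instructions`; RAG fills `retrieved_docs` at query time.
    parts = [instructions]
    if retrieved_docs:
        parts.append("Context:\n" + "\n".join("- " + doc for doc in retrieved_docs))
    parts.append("Question: " + question)
    return "\n\n".join(parts)

# Hypothetical private knowledge the base model would not know.
corpus = [
    "Refunds are accepted within 30 days of purchase.",
    "Shipping takes five business days.",
]
question = "How many days do I have to return a purchase?"

# Prompting only: the instructions change, the available knowledge does not.
prompt_only = build_prompt(question, "Answer in one friendly sentence.")

# Prompting plus RAG: same instructions, with retrieved facts injected.
with_rag = build_prompt(question, "Answer in one friendly sentence.",
                        retrieve(question, corpus))
print(with_rag)
```

The two prompts differ only in the injected context, which is why changing facts belong in retrieval: updating the corpus updates the answers without touching the model.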
The demo is a decision function, not a model: given what you're actually trying to change (style, format, latency, or new facts), it tells you which tool fits — so you stop reaching for fine-tuning by default.
Use these three prompts in order. Each builds on the one before.
In one paragraph, explain the difference between prompting, RAG, and fine-tuning, like I'm new to it.
Walk me through how I'd decide between fine-tuning, RAG, and prompt engineering for a given task, step by step.
Given a customer-support bot that must match our tone, cite current policy docs, and run cheaply at scale, which combination of prompting, RAG, and fine-tuning would you use and why?
# A decision helper. Run it against your real problem before training anything.
def choose_approach(needs_new_facts, facts_change_often, needs_consistent_style,
                    needs_strict_format, latency_or_cost_sensitive):
    # Checks run in priority order; the first condition that matches wins.
    if needs_new_facts and facts_change_often:
        return "RAG: inject knowledge at query time; don't bake changing facts into weights"
    if needs_consistent_style or needs_strict_format:
        return "Fine-tune (LoRA): teach a stable behavior the prompt can't reliably enforce"
    if latency_or_cost_sensitive:
        return "Fine-tune a SMALL model: distill the behavior so a 3B can replace a 70B"
    # Nothing above applies, so start with the cheapest option.
    return "Prompt engineering first: cheapest, fastest, reversible -- exhaust it before training"

# Fresh, frequently changing facts: expect the RAG recommendation.
print(choose_approach(needs_new_facts=True, facts_change_often=True,
                      needs_consistent_style=False, needs_strict_format=False,
                      latency_or_cost_sensitive=False))
# Stable style and format needs: expect the LoRA recommendation,
# because that check runs before the latency/cost check.
print(choose_approach(False, False, needs_consistent_style=True,
                      needs_strict_format=True, latency_or_cost_sensitive=True))

# Run with: python3 main.py
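The third prompt above asks about a combination of tools, while the demo returns only the first match. One possible extension, offered as a sketch and not as part of the lesson's demo, collects every tool that applies. The function name `choose_combination`, the short labels, and the flag values chosen for the customer-support bot are all assumptions made for illustration.

```python
def choose_combination(needs_new_facts, facts_change_often, needs_consistent_style,
                       needs_strict_format, latency_or_cost_sensitive):
    # Hypothetical variant: return every applicable tool instead of the first match.
    plan = ["prompting"]  # always the starting point: cheap, fast, reversible
    if needs_new_facts and facts_change_often:
        plan.append("RAG")
    if needs_consistent_style or needs_strict_format:
        plan.append("fine-tune (LoRA)")
    if latency_or_cost_sensitive:
        plan.append("fine-tune a small model")
    return plan

# Assumed reading of the support-bot scenario: tone must match (style),
# policy docs are current (changing facts), and it must run cheaply at scale.
support_bot_plan = choose_combination(
    needs_new_facts=True, facts_change_often=True,
    needs_consistent_style=True, needs_strict_format=False,
    latency_or_cost_sensitive=True)
print(support_bot_plan)
# -> ['prompting', 'RAG', 'fine-tune (LoRA)', 'fine-tune a small model']
```

Returning a list keeps the lesson's ordering (prompt first, train last) while making it visible that the tools are not mutually exclusive.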