Open this lesson in your favourite AI. It'll walk you through the why, explain the demo, and quiz you on the try-it list.
A hotfix is a patch applied to production to fix a critical bug, on a timeline measured in hours — not sprints. The testing challenge: how do you maintain quality while moving faster than your normal release process allows? The answer is a pre-defined, minimal hotfix test protocol: a regression pack of the smallest set of tests that covers the changed code and its most critical neighbors, a manual smoke test of the affected feature, and a monitoring watch period after deploy. 'We don't have time to test' is never acceptable — but 'we have a 30-minute targeted test protocol for hotfixes' is how responsible teams ship emergency fixes without creating more incidents.
A hotfix test protocol is a pre-defined, scope-limited set of tests that gives enough confidence to deploy a production fix in under two hours: unit tests for the exact changed function, integration tests for the affected endpoint, and a smoke test of the adjacent critical paths that share the most risk with the change. Running the full regression suite during a hotfix is usually wrong — it takes too long and the signal is too diffuse — but running nothing is never acceptable. Defining this protocol before an incident forces the team to think clearly about risk scope rather than improvising under pressure.
import subprocess, sys, time
# Hotfix test protocol — run this instead of the full suite for emergency patches
HOTFIX_CONTEXT = """
Bug: Login returns HTTP 500 when email contains a + character
Fix: Added email normalization in auth/login.py:validate_credentials()
Commit: a3f91bc
"""
# ── Step 1: Unit tests for the exact changed function ──────────────────────
UNIT_TESTS = [
"tests/unit/auth/test_validate_credentials.py::test_email_with_plus_sign",
"tests/unit/auth/test_validate_credentials.py::test_email_normalization",
"tests/unit/auth/test_validate_credentials.py", # full module
]
# ── Step 2: Integration tests for the affected API endpoint ──────────────────
INTEGRATION_TESTS = [
"tests/integration/test_auth_api.py::test_login_success",
"tests/integration/test_auth_api.py::test_login_invalid_credentials",
"tests/integration/test_auth_api.py::test_login_with_special_email_chars",
"tests/integration/test_auth_api.py",
]
# ── Step 3: Smoke tests for adjacent features (regression risk) ──────────────
SMOKE_TESTS = [
"tests/smoke/test_auth_smoke.py",
"tests/smoke/test_user_profile_smoke.py",
]
ALL_HOTFIX_TESTS = UNIT_TESTS + INTEGRATION_TESTS + SMOKE_TESTS
def run_hotfix_protocol():
print(f"HOTFIX TEST PROTOCOL")
print(f"Context: {HOTFIX_CONTEXT.strip()}")
print(f"Running {len(ALL_HOTFIX_TESTS)} targeted test targets\n")
for test_path in ALL_HOTFIX_TESTS:
print(f" pytest {test_path}")
print("\nPost-deploy checklist:")
checklist = [
"✓ Deploy to staging and run smoke tests",
"✓ Manual test: login with user+tag@example.com",
"✓ Check error rate in Datadog/Grafana for 15 min after deploy",
"✓ Check login success rate in analytics for 30 min",
"✓ Notify on-call if error rate > baseline",
"✗ DO NOT run full 2-hour regression suite — deploy to production first, run nightly",
]
for item in checklist: print(f" {item}")
run_hotfix_protocol()python3 main.pygit diff --name-only HEAD~1 | grep test_ | xargs pytest. This is a minimal hotfix CI gate. What are the risks of this approach?Use these three in order. Each builds on the one before.
In one paragraph, explain what a hotfix is and why its testing process is different from a normal release. What's the minimum testing you should do before deploying a hotfix to production?
Walk me through how to select the right test subset for a hotfix: how do you identify which unit tests cover the changed function, which integration tests cover the affected endpoint, and which smoke tests cover the adjacent risk area?
I'm an SDET at a company that ships 3 hotfixes per month. Each one requires a judgment call: how much testing is 'enough' given the severity and the time pressure? Design a hotfix testing matrix: for each severity level (P0 production down, P1 major feature broken, P2 minor bug), specify the test protocol, the maximum time before deploy, the post-deploy monitoring window, and the rollback trigger criteria.