Composition and framing — camera language for AI prompts

medium

Learn with your AI

Open this lesson in your favourite AI. It'll walk you through the why, explain the demo, and quiz you on the try-it list.

Open in Claude Open in ChatGPT

Why this matters

The vocabulary of cinematography — wide shot, close-up, over-the-shoulder, rack focus, dolly-in — is a 120-year-old prompt specification language. AI video models have been trained on decades of labeled footage; they know these terms. Saying 'medium close-up, 35mm lens, shallow depth of field, key light from camera left' gives you three orders of magnitude more control than 'a person talking.' This task drills the vocabulary and its effect on generation quality.

Demo

Three axes: shot size (ELS, ES, LS, MLS, MS, MCU, CU, XCU, ECU), angle (eye-level, high, low, Dutch, bird's-eye, POV), and focal length (14mm, 24mm, 35mm, 50mm, 85mm, 135mm). Combining all three precisely gets you the shot you wanted; combining them vaguely gets you a cousin of it. Cinematographers have spent a century associating each combination with emotional tone — leverage that lineage.

# Prompt vocab cheat sheet — drop these into your tool
 
Shot size (distance):
  ELS = extreme long shot (character is tiny, emphasises landscape)
  LS  = long shot (full body with surroundings)
  MLS = medium long shot (knees up)
  MS  = medium shot (waist up — most common for dialogue)
  MCU = medium close-up (chest up)
  CU  = close-up (head and shoulders)
  XCU = extreme close-up (just eyes or mouth)
 
Angle:
  eye-level, high angle (looking down — diminishes subject)
  low angle (looking up — heroic)
  Dutch angle (tilted — unease)
  POV (subject's viewpoint)
 
Focal length:
  14–24mm = wide, distorts, emphasises space
  35mm   = standard documentary/indie look
  50mm   = close to human vision — neutral
  85mm   = portrait, flattering face
  135mm  = telephoto, compressed background
 
Movement:
  locked off, pan, tilt, dolly in/out, tracking, handheld, crane, steadicam
 
Example full prompt:
  "MS, 35mm, eye-level, slow dolly-in on LENA seated by a stone window at sunset.
   Warm key light from window, blue fill. Shallow depth of field. Handheld."

Try it yourself

Generate the same scene with three different shot sizes (WS, MS, CU) and three focal lengths (24mm, 50mm, 85mm). Nine variants. Rank them for intimacy — the correlation with focal length is strong.

Write out the cinematography prompt for each shot in your shot list using the vocabulary above. Aim for 3-5 descriptors per shot.

Watch 5 minutes of a film you love. Pause every 10 seconds. Write down the shot size + angle + approximate focal length. This drills your eye AND your vocabulary.

Read one chapter of 'In the Blink of an Eye' (Walter Murch on editing) — specifically the section on how shot selection affects emotional register. Apply to your shot list.

Avoid the 'cinematic' filler word. Every AI model has been trained on too many 'cinematic' prompts; it rarely changes the output meaningfully. Use specific terms instead.

Prompt your AI

Use these three in order. Each builds on the one before.

1. Basics & terminology

In one paragraph, teach me the three axes of cinematic shot prompt vocabulary — shot size, angle, focal length — and why specificity here beats vague 'cinematic' descriptors.

2. Why it works (the mechanism)

Walk me through why a 35mm wide on a subject feels different from an 85mm close-up of the same subject — the optics, the perspective compression, the emotional register — and how AI video models have learned these associations.

3. Advanced — application & what's next

I'm shooting a dialogue scene between two characters: ideally I want coverage (wide, two close-ups, an OTS each direction). Walk me through the prompts for each shot, how I maintain character consistency across them, and how I cut between them in the edit to maintain the 180-degree line.