
Video Prompts

Direct the shot, not the story.

Cinematographic language transfers directly to video prompts. Shot type, camera motion, lighting, and pacing are the prompting primitives. The models that respond best tend to be those trained on the same visual grammar filmmakers already speak.

The Template

| Element | What It Controls | Example |
|---|---|---|
| Shot type | Framing | "Medium close-up" |
| Camera motion | Movement | "Slow dolly forward" |
| Subject action | What happens | "Woman turns to face camera" |
| Lighting | Mood and depth | "Rim light from behind, deep shadows" |
| Duration | Clip length | "4 seconds" |
| Style | Visual treatment | "35mm film grain, desaturated" |
| Transition | How clips connect | "Match cut on hand gesture" |
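
One way to make sure every prompt covers the same elements is to treat the template as a small data structure. The following is a minimal Python sketch, not any tool's API: the `ShotSpec` class, its field names, and the sentence order used in `render()` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ShotSpec:
    """One clip described with the template's elements (hypothetical helper)."""
    shot_type: str          # framing, e.g. "Medium close-up"
    camera_motion: str      # movement, e.g. "Slow dolly forward"
    subject_action: str     # what happens in the clip
    lighting: str           # mood and depth
    duration_seconds: int   # clip length
    style: str              # visual treatment
    transition: str = ""    # how this clip connects to the next (optional)

    def render(self) -> str:
        """Join the elements into a single prompt string."""
        parts = [
            self.shot_type,
            self.subject_action,
            f"Camera: {self.camera_motion}",
            f"Lighting: {self.lighting}",
            f"Style: {self.style}",
            f"{self.duration_seconds} seconds",
        ]
        if self.transition:
            parts.append(f"Transition: {self.transition}")
        return ". ".join(parts) + "."


shot = ShotSpec(
    shot_type="Medium close-up",
    camera_motion="Slow dolly forward",
    subject_action="Woman turns to face camera",
    lighting="Rim light from behind, deep shadows",
    duration_seconds=4,
    style="35mm film grain, desaturated",
    transition="Match cut on hand gesture",
)
print(shot.render())
```

Keeping the elements as named fields makes a missing one easy to spot before the prompt is sent to a model.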

Techniques

Cinematographic Language

Use the vocabulary directors use. Models trained on film metadata respond to it.

"Extreme wide shot. Desert landscape, single figure walking away from camera. Heat haze distortion. Locked tripod. 6 seconds."

Temporal Direction

Describe what changes over time, not a frozen moment.

"Start tight on hands typing. Slowly zoom out to reveal the room is empty except for the desk. Natural light shifts from warm to blue as clouds pass. 8 seconds."

Consistency Across Clips

Keeping a character and a look consistent from clip to clip is the hardest problem in AI video. Anchor every prompt with the same specifics.

"Same character throughout: East Asian woman, mid-30s, short black hair, navy coat. Maintain consistent lighting: overcast daylight, no direct sun. Same color grade: muted teal and orange."
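
Because each clip is usually generated from its own prompt, the anchor description above has to be repeated word for word every time. A minimal sketch of that idea in Python follows. The `with_anchors` helper and the two short shot descriptions are hypothetical placeholders, not part of any tool.

```python
# Shared description that must not drift between clips.
ANCHORS = {
    "Subject": "East Asian woman, mid-30s, short black hair, navy coat",
    "Lighting": "overcast daylight, no direct sun",
    "Color grade": "muted teal and orange",
}


def with_anchors(shot_text: str, anchors: dict[str, str]) -> str:
    """Prefix a shot description with the shared anchor block."""
    anchor_lines = [f"{name}: {value}." for name, value in anchors.items()]
    return " ".join(anchor_lines) + " " + shot_text


# Placeholder shot descriptions, for illustration only.
shots = [
    "Medium close-up. She turns to face camera. 4 seconds.",
    "Wide shot. She walks away from camera. 6 seconds.",
]

# Every per-clip prompt starts with exactly the same anchor text.
prompts = [with_anchors(text, ANCHORS) for text in shots]
for prompt in prompts:
    print(prompt)
```

Storing the anchors once and reusing them avoids the small rewordings that tend to creep in when descriptions are retyped by hand.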

Shot Sequencing

Think in montages, not individual clips.

"Shot 1: Close-up of coffee being poured, steam rising. Shot 2: Wide shot of empty cafe, morning light. Shot 3: Medium shot of barista wiping counter, slow motion. All shots: warm analog palette, shallow depth of field."
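
The montage prompt above follows a regular pattern: numbered shots followed by one shared style line. A short Python sketch can assemble it from a list. The `build_sequence` function is a hypothetical helper written for this example.

```python
def build_sequence(shots: list[str], shared_style: str) -> str:
    """Number each shot and append the style that applies to all of them."""
    numbered = [f"Shot {i}: {text}." for i, text in enumerate(shots, start=1)]
    return " ".join(numbered) + f" All shots: {shared_style}."


sequence = build_sequence(
    [
        "Close-up of coffee being poured, steam rising",
        "Wide shot of empty cafe, morning light",
        "Medium shot of barista wiping counter, slow motion",
    ],
    shared_style="warm analog palette, shallow depth of field",
)
print(sequence)
```

Reordering or swapping a shot then only means editing the list, and the numbering and shared style line stay correct.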

The Prompt

A complete 5-shot sequence prompt for a brand film. Uses cinematographic language, temporal direction, and consistency anchors across clips.

You are directing a 25-second brand film for a premium coffee company.
Every shot must feel intentional — this is cinema, not stock footage.

CONSISTENCY ANCHORS (maintain across ALL shots):
- Subject: East Asian woman, mid-30s, short black hair, navy wool coat
- Color grade: Muted teal shadows, warm amber highlights. Desaturated.
- Lighting: Overcast natural daylight. No direct sun. Soft and diffused.
- Film stock: 35mm grain, slight vignette. NOT clean digital.
- Aspect ratio: 2.39:1 (cinematic widescreen)

SHOT 1 — THE DETAIL (4 seconds)
Extreme close-up. Hot coffee being poured into a ceramic cup. Steam
rises in slow motion. Shallow depth of field — only the stream of
liquid is sharp. Sound: quiet pour, ambient cafe murmur.
Camera: locked tripod, no movement.

SHOT 2 — THE SPACE (6 seconds)
Wide establishing shot. Empty cafe interior, morning light streaming
through large windows. Dust particles visible in light shafts. One
table set with the cup from Shot 1. Subject enters frame from right,
walks to the table. Camera: slow dolly forward, barely perceptible.

SHOT 3 — THE MOMENT (5 seconds)
Medium close-up. Subject wraps both hands around the cup. Eyes close
briefly. Micro-expression: contentment, not performance. The first sip.
Camera: handheld with gentle breathing motion. Rack focus from hands
to face.

SHOT 4 — THE WORLD (6 seconds)
Wide shot through cafe window from outside. Rain on glass. Subject
visible inside, soft and warm. Street reflections overlay her image.
Camera: locked tripod, no pan or tilt. Let the rain do the work. Slow
lens zoom from wide to medium over the full 6 seconds.

SHOT 5 — THE BRAND (4 seconds)
Close-up of the cup on the table. A hand enters frame and sets down
a saucer. The cup bears a minimal logo. Pull focus to reveal the cafe
name etched in the window behind. Hold for 2 seconds.
Camera: static. Let the composition speak.

TRANSITION NOTES:
- Shot 1→2: Match cut on circular shape (cup rim → window frame)
- Shot 2→3: Jump cut on her sitting motion
- Shot 3→4: Dissolve (interior warmth → exterior cold)
- Shot 4→5: Hard cut (statement ending)

Generate shot-by-shot in Runway, Sora, or Kling. Use consistent subject descriptions across all shots.
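
Before generating, it can help to confirm that the per-shot durations add up to the runtime the brief asks for. The sketch below uses the five durations from the prompt above. The `total_runtime` and `runtime_gap` helpers are hypothetical names chosen for this example.

```python
# Durations in seconds, copied from the shot headings in the prompt.
SHOT_DURATIONS = {
    "THE DETAIL": 4,
    "THE SPACE": 6,
    "THE MOMENT": 5,
    "THE WORLD": 6,
    "THE BRAND": 4,
}


def total_runtime(durations: dict[str, int]) -> int:
    """Sum the clip lengths, ignoring any overlap added by transitions."""
    return sum(durations.values())


def runtime_gap(durations: dict[str, int], target_seconds: int) -> int:
    """Return target minus total: positive means unused time, negative means over."""
    return target_seconds - total_runtime(durations)


total = total_runtime(SHOT_DURATIONS)
print(f"Total runtime: {total} seconds")
```

Calling `runtime_gap` with the runtime stated in the brief shows whether the shot list needs trimming or padding before any clips are generated.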

Tools

| Tool | Strength | Link |
|---|---|---|
| Runway Gen-3 | Motion quality, camera control | runwayml.com |
| Sora | Length, coherence, physical realism | openai.com/sora |
| Kling | Motion fidelity, fast generation | klingai.com |
| Pika | Quick iterations, style transfer | pika.art |
| Luma Dream Machine | 3D consistency, camera paths | lumalabs.ai |
| Minimax | Long-form generation | minimaxi.com |
| Veo | Google's video model, integration with Gemini | deepmind.google/veo |

Context

Motion is meaning.

Questions

If you can describe any shot, does directing become writing?

  • What separates a prompt that generates footage from one that generates cinema?
  • When consistency across clips is solved, what's left that requires a human director?
  • How does temporal direction (things changing over time) differ from spatial composition (things arranged in a frame)?