Video Prompts

Direct the shot, not the story.

Cinematographic language transfers directly. Shot type, camera motion, lighting, pacing — these are the prompting primitives. The models that respond best are the ones trained on the same visual grammar filmmakers already speak.

The Template

Element	What It Controls	Example
Shot type	Framing	"Medium close-up"
Camera motion	Movement	"Slow dolly forward"
Subject action	What happens	"Woman turns to face camera"
Lighting	Mood and depth	"Rim light from behind, deep shadows"
Duration	Clip length	"4 seconds"
Style	Visual treatment	"35mm film grain, desaturated"
Transition	How clips connect	"Match cut on hand gesture"
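
As a rough illustration, the template can be treated as a small record whose fields are rendered into one prompt string. The sketch below is in Python. `ShotSpec` and `render` are hypothetical names, and the "Camera:", "Lighting:", "Style:", and "Transition:" labels are an assumed convention rather than something any particular video model requires.

```python
from dataclasses import dataclass


@dataclass
class ShotSpec:
    """One clip described with the template's elements (hypothetical helper)."""

    shot_type: str
    camera_motion: str
    subject_action: str
    lighting: str
    duration_s: int
    style: str
    transition: str = ""  # optional: how this clip connects to the next

    def render(self) -> str:
        # Field order follows the table: framing, motion, action, light,
        # duration, style, then the optional transition.
        parts = [
            self.shot_type,
            f"Camera: {self.camera_motion}",
            self.subject_action,
            f"Lighting: {self.lighting}",
            f"{self.duration_s} seconds",
            f"Style: {self.style}",
        ]
        if self.transition:
            parts.append(f"Transition: {self.transition}")
        # Strip trailing periods so joining never produces a double period.
        return ". ".join(p.rstrip(".") for p in parts) + "."


shot = ShotSpec(
    shot_type="Medium close-up",
    camera_motion="slow dolly forward",
    subject_action="Woman turns to face camera",
    lighting="rim light from behind, deep shadows",
    duration_s=4,
    style="35mm film grain, desaturated",
    transition="match cut on hand gesture",
)
print(shot.render())
```

Keeping each element in its own field makes it easy to vary one thing at a time, for example the camera motion, while the rest of the shot stays fixed.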

Techniques

Cinematographic Language

Use the vocabulary directors use. Models trained on film metadata respond to it.

"Extreme wide shot. Desert landscape, single figure walking away from camera. Heat haze distortion. Locked tripod. 6 seconds."

Temporal Direction

Describe what changes over time, not a frozen moment.

"Start tight on hands typing. Slowly zoom out to reveal the room is empty except for the desk. Natural light shifts from warm to blue as clouds pass. 8 seconds."

Consistency Across Clips

The hardest problem in AI video. Anchor with specifics.

"Same character throughout: East Asian woman, mid-30s, short black hair, navy coat. Maintain consistent lighting — overcast daylight, no direct sun. Same colour grade — muted teal and orange."

Shot Sequencing

Think in montages, not individual clips.

"Shot 1: Close-up of coffee being poured, steam rising. Shot 2: Wide shot of empty cafe, morning light. Shot 3: Medium shot of barista wiping counter, slow motion. All shots: warm analog palette, shallow depth of field."

The Prompt

A complete 5-shot sequence prompt for a brand film. Uses cinematographic language, temporal direction, and consistency anchors across clips.

You are directing a 25-second brand film for a premium coffee company.
Every shot must feel intentional — this is cinema, not stock footage.

CONSISTENCY ANCHORS (maintain across ALL shots):
- Subject: East Asian woman, mid-30s, short black hair, navy wool coat
- Color grade: Muted teal shadows, warm amber highlights. Desaturated.
- Lighting: Overcast natural daylight. No direct sun. Soft and diffused.
- Film stock: 35mm grain, slight vignette. NOT clean digital.
- Aspect ratio: 2.39:1 (cinematic widescreen)

SHOT 1 — THE DETAIL (4 seconds)
Extreme close-up. Hot coffee being poured into a ceramic cup. Steam
rises in slow motion. Shallow depth of field — only the stream of
liquid is sharp. Sound: quiet pour, ambient cafe murmur.
Camera: locked tripod, no movement.

SHOT 2 — THE SPACE (6 seconds)
Wide establishing shot. Empty cafe interior, morning light streaming
through large windows. Dust particles visible in light shafts. One
table set with the cup from Shot 1. Subject enters frame from right,
walks to the table. Camera: slow dolly forward, barely perceptible.

SHOT 3 — THE MOMENT (5 seconds)
Medium close-up. Subject wraps both hands around the cup. Eyes close
briefly. Micro-expression: contentment, not performance. The first sip.
Camera: handheld with gentle breathing motion. Rack focus from hands
to face.

SHOT 4 — THE WORLD (6 seconds)
Wide shot through cafe window from outside. Rain on glass. Subject
visible inside, soft and warm. Street reflections overlay her image.
Camera: locked position, no pan or tilt; let the rain do the work. Slow
lens zoom from wide to medium over the full 6 seconds.

SHOT 5 — THE BRAND (4 seconds)
Close-up of the cup on the table. A hand enters frame and sets down
a saucer. The cup bears a minimal logo. Pull focus to reveal the cafe
name etched in the window behind. Hold for 2 seconds.
Camera: static. Let the composition speak.

TRANSITION NOTES:
- Shot 1→2: Match cut on circular shape (cup rim → window frame)
- Shot 2→3: Cut on action as she sits down
- Shot 3→4: Dissolve (interior warmth → exterior cold)
- Shot 4→5: Hard cut (statement ending)

Generate shot-by-shot in Runway, Sora, or Kling. Use consistent subject descriptions across all shots.
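
When generating shot by shot, one way to keep the subject description identical is to store the anchors once and append them verbatim to every clip's prompt. The Python sketch below assumes that approach. `Shot`, `ANCHORS`, and `per_shot_prompts` are hypothetical names, the shot directions are abbreviated from the sequence above, and no specific tool's API is called. It also tallies the durations as listed, which add up to 25 seconds.

```python
from dataclasses import dataclass

# Anchor text condensed from the CONSISTENCY ANCHORS block above.
ANCHORS = (
    "Subject: East Asian woman, mid-30s, short black hair, navy wool coat. "
    "Color grade: muted teal shadows, warm amber highlights, desaturated. "
    "Lighting: overcast natural daylight, no direct sun, soft and diffused. "
    "Film stock: 35mm grain, slight vignette, not clean digital. "
    "Aspect ratio: 2.39:1."
)


@dataclass
class Shot:
    name: str
    seconds: int
    direction: str  # abbreviated here; paste the full shot text in practice


SHOTS = [
    Shot("THE DETAIL", 4, "Extreme close-up. Hot coffee poured into a ceramic cup."),
    Shot("THE SPACE", 6, "Wide establishing shot. Empty cafe interior, morning light."),
    Shot("THE MOMENT", 5, "Medium close-up. Subject wraps both hands around the cup."),
    Shot("THE WORLD", 6, "Wide shot through cafe window from outside. Rain on glass."),
    Shot("THE BRAND", 4, "Close-up of the cup on the table."),
]


def per_shot_prompts(shots, anchors):
    """Build one prompt per clip, repeating the anchor text verbatim in each."""
    return [f"{s.direction} {s.seconds} seconds. {anchors}" for s in shots]


prompts = per_shot_prompts(SHOTS, ANCHORS)
total_seconds = sum(s.seconds for s in SHOTS)

for shot, prompt in zip(SHOTS, prompts):
    print(f"--- {shot.name} ---\n{prompt}\n")
print(f"Total running time: {total_seconds} seconds")
```

Repeating the anchor text word for word, rather than paraphrasing it per shot, removes one source of drift between clips. How closely a given tool honours the anchors is something to verify by testing.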

Tools

Tool	Strength	Link
Runway Gen-3	Motion quality, camera control	runwayml.com
Sora	Length, coherence, physical realism	openai.com/sora
Kling	Motion fidelity, fast generation	klingai.com
Pika	Quick iterations, style transfer	pika.art
Luma Dream Machine	3D consistency, camera paths	lumalabs.ai
Minimax	Long-form generation	minimaxi.com
Veo	Google's video model, integration with Gemini	deepmind.google/veo

Context

Video — Video tool capabilities and comparison
Visual Prompting — Image prompt principles that transfer to video
Visual Art Prompts — Composition and style techniques
Prompts — First principles across all modalities

Mantra

Motion is meaning.

Questions

If you can describe any shot, does directing become writing?

What separates a prompt that generates footage from one that generates cinema?
When consistency across clips is solved, what's left that requires a human director?
How does temporal direction (things changing over time) differ from spatial composition (things arranged in a frame)?
