Visual Prompting
What separates a prompt that returns generic stock art from one that returns exactly what you pictured?

Structure. The same structure that separates a wishlist from a blueprint.
The Problem
Most visual prompts fail the same way: too vague on layout, too heavy on adjectives. "A beautiful futuristic dashboard with glowing elements" returns something, but rarely what you needed.
| Prompt Failure | What Happens | Root Cause |
|---|---|---|
| No spatial layout | AI guesses composition | You described WHAT, not WHERE |
| No colour constraints | Random palette | You left the biggest visual decision to chance |
| Style + content mixed | Incoherent output | Two different instructions fighting |
| No negative prompts | Unwanted elements appear | You only said what you want, not what you don't |
| Text in the image | Garbled letterforms | Most current models can't reliably render text |
The Template
Fill in each section. Skip nothing. Order matters.
[1. FORMAT]
Aspect ratio, orientation, resolution intent.
[2. WHAT — Subject and composition]
The thing itself. Spatial layout. What goes where.
[3. STYLE — Visual language]
Aesthetic reference. Art direction. Period, medium, influence.
[4. COLOUR — Palette]
Specific hex codes or named palette. Background, foreground, accent.
[5. TYPOGRAPHY — Text handling]
What text MUST appear. What text to add in post-production instead.
[6. MOOD — Feeling]
What emotion does the viewer experience? One sentence.
[7. NOT — Negative prompts]
What to exclude. Be specific about common failure modes.
[8. PARAMS — Model-specific flags]
Midjourney: --ar, --style, --s, --v
Stable Diffusion: negative prompt, CFG scale, steps
DALL-E: style parameter
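
As a rough illustration of "skip nothing, order matters", the template can be assembled by a small helper that refuses to build a prompt when a section is empty. This is a minimal sketch: `SECTION_ORDER` and `build_prompt` are hypothetical names, and the bracketed label format is an assumption rather than something any particular tool requires.

```python
from __future__ import annotations

# Section names in the order the template prescribes (hypothetical constant).
SECTION_ORDER = [
    "FORMAT", "WHAT", "STYLE", "COLOUR",
    "TYPOGRAPHY", "MOOD", "NOT", "PARAMS",
]


def build_prompt(sections: dict[str, str]) -> str:
    """Join the eight sections in template order, failing on any gap."""
    missing = [name for name in SECTION_ORDER
               if not sections.get(name, "").strip()]
    if missing:
        raise ValueError("Missing sections: " + ", ".join(missing))
    # One labelled line per section, always in the same order.
    return "\n".join(f"[{name}] {sections[name].strip()}"
                     for name in SECTION_ORDER)
```

Some tools take parameters or negative prompts in separate fields, so in practice the last two sections may need to be routed elsewhere instead of being joined into the text.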
Section by Section
1. Format
State this first. It constrains everything.
| Format | When to Use |
|---|---|
| 16:9 landscape | Presentations, hero images, diagrams |
| 9:16 portrait | Mobile, social stories |
| 1:1 square | Social posts, avatars |
| 4:5 portrait | Instagram feed |
| 21:9 ultrawide | Cinematic, banners |
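
For tools that ask for pixel dimensions instead of a ratio, the conversion is simple arithmetic. The sketch below is illustrative only: `dimensions` is a hypothetical helper, and both the 1024-pixel long side and the snap to multiples of 64 are assumptions that vary by model.

```python
from __future__ import annotations


def dimensions(aspect: str, long_side: int = 1024,
               multiple: int = 64) -> tuple[int, int]:
    """Turn a ratio such as "16:9" into (width, height) in pixels."""
    w_ratio, h_ratio = (int(part) for part in aspect.split(":"))
    # Scale so the longer edge matches long_side.
    scale = long_side / max(w_ratio, h_ratio)

    def snap(value: int) -> int:
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value * scale / multiple) * multiple)

    return snap(w_ratio), snap(h_ratio)


print(dimensions("16:9"))  # (1024, 576)
print(dimensions("4:5"))   # (832, 1024)
```

Snapping can shift the ratio slightly (4:5 becomes 832 x 1024), which is usually acceptable for generation and can be cropped afterwards.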
2. Subject
Describe composition like a stage direction, not a wish. Name the spatial zones.
| Weak | Strong |
|---|---|
| "A diagram showing business concepts" | "Three-column triptych. Left column: five stacked horizontal bars. Centre: ten nodes in a clockwise ring with directional arrows. Right: single block with text above." |
| "Some shapes connected together" | "Rounded rectangles with thin 1px borders arranged in a circle, connected by directional arrows forming continuous clockwise flow." |
Use concrete spatial language: top-left, centre, bottom third, foreground, background. Name how many elements exist. State their relationships.
3. Style
Reference real things. "Clean" means nothing to a model. "Edward Tufte's information density meets Dieter Rams' minimalism" means everything.
| Reference Type | Example |
|---|---|
| Designer/artist | "Massimo Vignelli's grid discipline" |
| Medium | "Screen-printed poster, limited colour run" |
| Era | "Swiss International Style, 1960s" |
| Product | "Bloomberg terminal aesthetic" |
| Combination | "Apple keynote data visualisation meets transit map clarity" |
4. Colour
Never leave colour to chance. Three minimum: background, primary, accent.
Background: #111827 (near-black blue-gray)
Text/primary: #F9FAFB (cool off-white)
Accent: #7C3AED (purple — used sparingly, POSITION blocks only)
Secondary accent: #B91C1C (crimson — feedback chain)
Muted: #D1D5DB (gray — labels, secondary text)
Name where each colour appears. "Purple for accent" is vague. "Purple #7C3AED for the two POSITION blocks only" gives the model a constraint it can follow.
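
One way to enforce "three minimum, each with a named location" is to keep the palette as data and check it before writing the prompt. This is a sketch under stated assumptions: `colour_section` and the `(role, hex, usage)` tuple shape are hypothetical, not part of any tool's API.

```python
import re

# Six-digit hex codes only, e.g. #7C3AED.
HEX_RE = re.compile(r"^#[0-9A-Fa-f]{6}$")


def colour_section(palette):
    """Render (role, hex, usage) tuples as one prompt line each."""
    if len(palette) < 3:
        raise ValueError("Need at least background, primary and accent")
    lines = []
    for role, hex_code, usage in palette:
        if not HEX_RE.match(hex_code):
            raise ValueError(f"Not a six-digit hex code: {hex_code!r}")
        if not usage.strip():
            # A colour with no stated location is the vague case to avoid.
            raise ValueError(f"State where {role} appears")
        lines.append(f"{role}: {hex_code.upper()} ({usage.strip()})")
    return "\n".join(lines)


print(colour_section([
    ("Background", "#111827", "near-black blue-gray, full canvas"),
    ("Text/primary", "#F9FAFB", "all labels"),
    ("Accent", "#7C3AED", "the two POSITION blocks only"),
]))
```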
5. Typography
Most current image models cannot reliably render text. Plan for this.
| Strategy | When |
|---|---|
| No text in image | Safest. Add all text in Figma/Canva after. |
| Minimal text | 1-3 short labels. Works in Ideogram, unreliable in Midjourney. |
| Text-heavy | Generate layout only. Overlay all text in post. |
If you need text, state it explicitly, for example: Label reads exactly: "PROTOCOLS". Still expect to fix it manually.
6. Mood
One sentence. Not a list of adjectives — a feeling the viewer should have.
| Weak | Strong |
|---|---|
| "Professional, modern, clean, sleek" | "The clarity of a well-designed transit map." |
| "Inspiring and motivational" | "Forward momentum. A control room coming online." |
7. Negative Prompts
Tell the model what to exclude. Be specific about the failure modes of your category.
For information design:
NOT: clipart, generic corporate, motivational poster, 3D renders,
gradients everywhere, drop shadows, glossy effects, stock photo elements,
decorative borders, watercolor, hand-drawn sketch, people, faces.
For product photography:
NOT: floating objects, impossible reflections, extra fingers,
blurred text, watermarks, collage layouts.
Match your negatives to the category. Different subjects have different failure modes.
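
Keeping negatives in a per-category lookup makes it harder to paste the wrong list into a prompt. The sketch below reuses the two lists above; `NEGATIVES` and `negative_line` are hypothetical names chosen for illustration.

```python
# Negative terms per category, copied from the two lists above.
NEGATIVES = {
    "information design": [
        "clipart", "generic corporate", "motivational poster", "3D renders",
        "gradients everywhere", "drop shadows", "glossy effects",
        "stock photo elements", "decorative borders", "watercolor",
        "hand-drawn sketch", "people", "faces",
    ],
    "product photography": [
        "floating objects", "impossible reflections", "extra fingers",
        "blurred text", "watermarks", "collage layouts",
    ],
}


def negative_line(category: str) -> str:
    """Return the NOT: line for a known category, or fail loudly."""
    if category not in NEGATIVES:
        raise ValueError(f"No negative list defined for {category!r}")
    return "NOT: " + ", ".join(NEGATIVES[category]) + "."


print(negative_line("product photography"))
```

Failing on an unknown category is deliberate here: a missing list is a prompt to write one, not a reason to fall back on generic negatives.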
8. Model Parameters
Each tool has specific flags that matter.
| Tool | Key Parameters |
|---|---|
| Midjourney | --ar 16:9 (aspect ratio), --style raw (less artistic interpretation), --s 50 (low stylisation), --v 7 (model version) |
| DALL-E | style: natural or style: vivid, size specification |
| Flux/SD | CFG scale (7-12 typical for Stable Diffusion; Flux usually runs lower), steps (20-50), negative prompt field (where the tool exposes one) |
| Ideogram | Aspect ratio, magic prompt on/off (off for precise control) |
--style raw in Midjourney is critical for diagrams and information design. Without it, the model adds artistic embellishment that fights your layout.
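
For Midjourney, the flags from the table can be generated as a suffix so the diagram-friendly defaults are applied consistently. This is a minimal sketch: `midjourney_flags` is a hypothetical helper that only formats a string, and the default values simply mirror the table above.

```python
def midjourney_flags(aspect: str = "16:9", raw: bool = True,
                     stylize: int = 50, version: str = "7") -> str:
    """Build the parameter suffix to append to a Midjourney prompt."""
    if stylize < 0:
        raise ValueError("stylize must not be negative")
    flags = [f"--ar {aspect}"]
    if raw:
        # Raw style reduces the artistic embellishment described above.
        flags.append("--style raw")
    flags.append(f"--s {stylize}")
    flags.append(f"--v {version}")
    return " ".join(flags)


prompt = "Three-column triptych, five stacked horizontal bars on the left"
print(f"{prompt} {midjourney_flags()}")
```

Check accepted values against the current Midjourney documentation before relying on them, since available versions and ranges change between releases.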