A Blueprint, Not a Prompt: Why Declarative Beats Imperative in AI Video

bottom line
  • A 12-shot video requires 24+ manual prompts with repeated character descriptions; a blueprint describes the production once
  • Blueprints are JSON: version-controllable, diffable, forkable, batch-executable
  • Scene linking passes the last frame of each shot to the next, creating visual continuity without model memory
  • Auto-casting resolves character descriptions into reference images, injected into every relevant shot
  • telos run --plan validates blueprints and estimates costs without calling any API

Making a 12-shot AI video today requires writing at least 24 prompts: 12 for image generation, 12 for video generation. Each prompt repeats the character description because the model has no memory. Each shot requires manually tracking what happened in previous shots for continuity. You manage reference images by copying URLs between generation calls. You assemble the result in a separate video editor.

This is imperative video production. You tell the system what to do, step by step, for every shot. It does not scale. It does not reproduce. It does not version-control.

Blueprint filmmaking replaces per-clip prompting with declarative production specifications: JSON files that describe characters, shots, mood, and pacing, and that execute as a single pipeline.

What a Blueprint Looks Like

A blueprint describes what the production is, not how to produce it. Here is a simplified example:

{
  "title": "THE QUIET COUP",
  "target_duration": 150,
  "characters": [
    {
      "char_id": "marcus",
      "archetype": "The Falling Titan",
      "consistency_prompt": "A man in his 60s, silver hair, bespoke navy suit, eyes showing a mix of exhaustion and terror.",
      "iconic_visual_markers": "A slight tremor in his hand as he holds a crystal glass."
    },
    {
      "char_id": "elara",
      "archetype": "The Optimized Successor",
      "consistency_prompt": "Female, 30s, sharp ethereal features, eyes like polished glass.",
      "iconic_visual_markers": "Zero visible pores, a predatory but calm gaze."
    }
  ],
  "aesthetic_dna": {
    "visual_style": "Corporate Brutalism, cold steel and glass",
    "lighting_key": "Cold office blues vs. warm, nervous sweat on skin"
  },
  "narrative_arc": [
    {
      "scene": "The Audit",
      "description": "Marcus sits in a glass-walled office, reading a termination report.",
      "model_tier": "character"
    },
    {
      "scene": "The Boardroom",
      "description": "Elara stands at the head of the table, perfectly still.",
      "model_tier": "character"
    },
    {
      "scene": "The Server Room",
      "description": "Marcus descends into the server room, blue LED light.",
      "model_tier": "action"
    },
    {
      "scene": "The Resignation",
      "description": "Marcus places his keycard on the desk and walks away.",
      "model_tier": "epic"
    }
  ]
}

Notice what the blueprint does not contain: API endpoints, model names, prompt engineering tricks, reference image URLs, FFmpeg commands, or transition timing. Those decisions belong to the engine, not the creator. The blueprint captures creative intent. The pipeline handles execution.

The model_tier field ("character", "action", or "epic") tells the engine what visual fidelity each scene needs. Character scenes prioritize face consistency. Action scenes prioritize motion quality. Epic scenes use the highest-tier generation model. The engine maps these tiers to specific FAL models automatically.
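In code, the tier routing reduces to a lookup plus validation. A minimal sketch, assuming only the tier names from the blueprint schema above; TIER_ROUTING and select_tier are illustrative names, not the engine's actual API:

```python
# Sketch only: the tier names come from the blueprint schema; the table's
# contents and the helper name are illustrative, not Telos's real code.
TIER_ROUTING = {
    "character": "face consistency",        # prioritize identity across shots
    "action":    "motion quality",          # prioritize movement fidelity
    "epic":      "maximum visual fidelity", # highest-tier generation model
}

def select_tier(scene: dict) -> str:
    """Validate a scene's model_tier, defaulting to 'character'."""
    tier = scene.get("model_tier", "character")
    if tier not in TIER_ROUTING:
        raise ValueError(f"unknown model_tier: {tier!r}")
    return tier
```

An unknown tier fails fast at plan time rather than mid-generation, which is where a declarative spec earns its keep.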

What the Engine Does With a Blueprint

When you run telos run, the engine reads this JSON and executes a multi-stage pipeline. Each stage transforms the blueprint’s declarative intent into concrete production actions.

Character Resolution via Auto-Casting

Marcus is described as “a man in his 60s, silver hair, bespoke navy suit.” The casting director takes this description and generates three reference images, each from a different angle:

  1. Extreme close-up, shallow depth of field
  2. Medium close-up, three-quarter view, dramatic rim lighting
  3. Profile shot, high-contrast chiaroscuro lighting

Gemini 3.1 Pro evaluates all three variants against criteria: grounded hyper-realistic aesthetic, iconic presence, and micro-expression depth. It recommends one. The selected reference is saved to cast/marcus.jpg and injected into every shot prompt that features Marcus.

The terminal output during casting:

--- CASTING CALL: MARCUS ---
Archetype: The Falling Titan

OPTIONS GENERATED IN: projects/quiet-coup/casting
[0] projects/quiet-coup/casting/marcus_opt_0.jpg
[1] projects/quiet-coup/casting/marcus_opt_1.jpg
[2] projects/quiet-coup/casting/marcus_opt_2.jpg

[AUTO] Selected option [1] based on Gemini assessment.
       Saved to: projects/quiet-coup/cast/marcus.jpg
       To override: replace that file with your preferred option and re-run.

You can override. Replace the auto-selected reference with your own image, re-run, and the engine uses your choice. Casting is a suggestion, not a lock.

Prompt Refinement

The blueprint says: “Marcus sits in a glass-walled office, reading a termination report.” That is 11 words. The prompt refiner transforms it into a 200-word generation prompt:

  • Character reference anchors from casting (Marcus’s face, build, clothing)
  • Style directives from the aesthetic_dna block (Corporate Brutalism, cold office blues)
  • Continuity context from the previous shot (if this is shot 2+, what came before)
  • Profile-specific pacing and mood markers (film profile adds contemplative framing)
  • Technical parameters (aspect ratio, model tier mapping)

The creator writes 11 words. The engine constructs 200. The gap between those two numbers is where production automation lives.

Scene Linking

Scene linking extracts the last frame of each generated shot and passes it as a visual anchor to the next, creating continuity between independent AI video generations that have no shared memory.

AI video models do not remember previous generations. Each API call produces an independent clip with no knowledge of what came before. Without intervention, cutting between shots produces jarring discontinuities, colors shift, camera angles reset, environments change between consecutive scenes.

Scene linking compensates. After shot 1 generates, Telos extracts the final frame and passes it as tail_image_url to shot 2’s generation request. Combined with a prompt handshake (“continue from the previous scene”), this gives the model a visual starting point. The result is not perfect temporal coherence (that requires models with actual memory), but it produces noticeably smoother transitions than prompting each shot independently.
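The linking loop itself fits in a few lines. In this sketch, generate and extract_last_frame stand in for the engine's model call and frame extraction; tail_image_url is the parameter named above:

```python
def run_shots(shots: list, generate, extract_last_frame) -> list:
    """Chain shots by feeding each one the previous shot's final frame.

    Sketch: `generate` submits one generation request and returns a clip;
    `extract_last_frame` returns a frame reference (e.g. a URL). Both are
    stand-ins for engine internals.
    """
    clips, prev_frame = [], None
    for shot in shots:
        request = {"prompt": shot["prompt"]}
        if prev_frame is not None:
            request["tail_image_url"] = prev_frame        # visual anchor
            request["prompt"] += " Continue from the previous scene."
        clip = generate(request)
        prev_frame = extract_last_frame(clip)
        clips.append(clip)
    return clips
```

The first shot gets no anchor; every subsequent shot inherits one, which is exactly why a bad final frame propagates (see the limitations section).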

Duration-Based Cost Attribution

Each shot in the blueprint carries a duration. An 8-second shot at Kling O3 Standard costs $1.79 (8 × $0.224). The engine calculates this before calling the API. After generation, it logs the actual cost. Post-production, you get a per-shot cost breakdown showing estimate vs. actual.
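The arithmetic is one line per shot. The $0.224/second rate is the article's Kling O3 Standard figure; the rate-table key and helper name are illustrative:

```python
# Per-second rates; the Kling figure comes from the example above, the
# key name is illustrative.
RATE_PER_SECOND = {"kling-o3-standard": 0.224}

def estimate_cost(shot: dict) -> float:
    """Pre-flight estimate: duration x per-second rate, rounded to cents."""
    return round(shot["duration"] * RATE_PER_SECOND[shot["model"]], 2)
```

An 8-second shot estimates to round(8 × 0.224, 2) = 1.79, matching the figure above.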

The Software Engineering Argument

Blueprints are files. This is the most important thing about them.

| Capability | Manual Prompting | Blueprint |
| --- | --- | --- |
| Character consistency | Re-type description in every prompt | Declared once, auto-injected via casting |
| Shot continuity | Remember what happened 6 shots ago | Scene linking extracts and passes last frame |
| Cost visibility | Discover the bill after generating | telos run --plan shows estimates before spending |
| Reproducibility | Screenshot your prompt history | git clone the repo + telos run |
| Iteration | Re-prompt every shot from scratch | Edit the JSON, re-run; unchanged shots are cached |
| Collaboration | Slack screenshots of prompt chains | Pull request on the blueprint file |
| Batch production | Open 10 browser tabs | for bp in blueprints/*.json; do telos run "$bp"; done |

Version control deserves emphasis. git diff on a blueprint shows exactly what changed between version 1 and version 2 of a production. Changed Marcus’s age from 60 to 55? That is one line in the diff. Swapped the lighting from cold blues to warm amber? That is visible in the aesthetic_dna block. git blame shows who made each change and when.

Reproducibility follows from declarative specification. The same blueprint, run through the same engine version, produces comparable output. Not identical (generative models are stochastic), but structurally comparable. Same characters, same shot count, same pacing, same mood. This is versioned creative work, not ephemeral prompt chains.

Forkability is a consequence of both. Love someone’s blueprint structure? Fork it. Change the characters and setting. Keep the pacing and shot composition. Run it. You get a different production with the same structural DNA.

What Blueprints Cannot Do

Blueprints are JSON. They are not visual, not drag-and-drop, not WYSIWYG. If you cannot read a JSON file, blueprints are inaccessible. A visual blueprint editor would lower this barrier, but one does not exist yet.

The quality of a blueprint depends on the creator’s ability to describe scenes, characters, and mood. The engine executes; it does not create. A vague blueprint (“a guy walks somewhere”) produces a vague production. The engine amplifies creative direction; it does not substitute for it.

Scene linking depends on generation model quality. If Kling produces a shot where the final frame is dark or blurry, the next shot inherits that visual anchor. Garbage in, garbage out: the linking mechanism is only as good as the generated content.

Blueprint format is not standardized. Telos uses its own JSON schema. There is no industry-standard blueprint format for AI video production because the category does not exist yet. If it grows, format interoperability will matter.


Try it. Create a project, open the blueprint JSON, describe two characters and four shots, then run:

telos run --plan myproject

Zero API calls. Zero cost. The engine validates your blueprint, maps characters to casting calls, calculates shot costs, and shows you the full production plan. Edit the blueprint until it looks right, then run it for real.
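A plan pass reduces to validation plus counting, with no network calls. Field names match the example blueprint; the plan function and its specific checks are illustrative, not Telos's actual validator:

```python
def plan(blueprint: dict) -> dict:
    """Dry-run: validate required fields and summarize the work to be done.

    Illustrative sketch; the real planner also maps tiers to models and
    prices each shot.
    """
    for key in ("title", "characters", "narrative_arc"):
        if key not in blueprint:
            raise ValueError(f"blueprint missing required field: {key}")
    return {
        "casting_calls": [c["char_id"] for c in blueprint["characters"]],
        "shot_count": len(blueprint["narrative_arc"]),
        "api_calls": 0,   # plan mode never spends
    }
```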


topics
blueprint-filmmaking · declarative-video · ai-video · scene-linking · auto-casting