A Blueprint, Not a Prompt: Why Declarative Beats Imperative in AI Video

bottom line
  • A 12-shot video requires 24+ manual prompts with repeated character descriptions; a blueprint describes the production once
  • Blueprints are JSON: version-controllable, diffable, forkable, batch-executable
  • Scene linking passes the last frame of each shot to the next, creating visual continuity without model memory
  • Auto-casting resolves character descriptions into reference images, injected into every relevant shot
  • telos run --plan validates blueprints and estimates costs without calling any API

Making a 12-shot AI video today requires writing at least 24 prompts: 12 for image generation, 12 for video generation. Each prompt repeats the character description because the model has no memory. Each shot requires manually tracking what happened in previous shots for continuity. You manage reference images by copying URLs between generation calls. You assemble the result in a separate video editor.

This is imperative video production. You tell the system what to do, step by step, for every shot. It does not scale. It does not reproduce. It does not version-control.

Blueprint filmmaking replaces per-clip prompting with declarative production specifications: JSON files that describe characters, shots, mood, and pacing, and that execute as a single pipeline.

What a Blueprint Looks Like

A blueprint describes what the production is, not how to produce it. Here is a simplified example:

{
  "title": "THE QUIET COUP",
  "target_duration": 150,
  "characters": [
    {
      "char_id": "marcus",
      "archetype": "The Falling Titan",
      "consistency_prompt": "A man in his 60s, silver hair, bespoke navy suit, eyes showing a mix of exhaustion and terror.",
      "iconic_visual_markers": "A slight tremor in his hand as he holds a crystal glass."
    },
    {
      "char_id": "elara",
      "archetype": "The Optimized Successor",
      "consistency_prompt": "Female, 30s, sharp ethereal features, eyes like polished glass.",
      "iconic_visual_markers": "Zero visible pores, a predatory but calm gaze."
    }
  ],
  "aesthetic_dna": {
    "visual_style": "Corporate Brutalism, cold steel and glass",
    "lighting_key": "Cold office blues vs. warm, nervous sweat on skin"
  },
  "narrative_arc": [
    {
      "scene": "The Audit",
      "description": "Marcus sits in a glass-walled office, reading a termination report.",
      "model_tier": "character"
    },
    {
      "scene": "The Boardroom",
      "description": "Elara stands at the head of the table, perfectly still.",
      "model_tier": "character"
    },
    {
      "scene": "The Server Room",
      "description": "Marcus descends into the server room, blue LED light.",
      "model_tier": "action"
    },
    {
      "scene": "The Resignation",
      "description": "Marcus places his keycard on the desk and walks away.",
      "model_tier": "epic"
    }
  ]
}

Notice what the blueprint does not contain: API endpoints, model names, prompt engineering tricks, reference image URLs, FFmpeg commands, or transition timing. Those decisions belong to the engine, not the creator. The blueprint captures creative intent. The pipeline handles execution.

The model_tier field ("character", "action", or "epic") tells the engine what visual fidelity each scene needs. Character scenes prioritize face consistency. Action scenes prioritize motion quality. Epic scenes use the highest-tier generation model. The engine maps these tiers to specific FAL models automatically.
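In code, the tier routing reduces to a lookup plus validation. A minimal sketch, assuming only the tier names from the blueprint schema above; TIER_ROUTING and select_tier are illustrative names, not the engine's actual API:

```python
# Sketch only: the tier names come from the blueprint schema; the table's
# contents and the helper name are illustrative, not Telos's real code.
TIER_ROUTING = {
    "character": "face consistency",        # prioritize identity across shots
    "action":    "motion quality",          # prioritize movement fidelity
    "epic":      "maximum visual fidelity", # highest-tier generation model
}

def select_tier(scene: dict) -> str:
    """Validate a scene's model_tier, defaulting to 'character'."""
    tier = scene.get("model_tier", "character")
    if tier not in TIER_ROUTING:
        raise ValueError(f"unknown model_tier: {tier!r}")
    return tier
```

An unknown tier fails fast at plan time rather than mid-generation, which is where a declarative spec earns its keep.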

What the Engine Does With a Blueprint

When you run telos run, the engine reads this JSON and executes a multi-stage pipeline. Each stage transforms the blueprint’s declarative intent into concrete production actions.

Character Resolution via Auto-Casting

Marcus is described as “a man in his 60s, silver hair, bespoke navy suit.” The casting director takes this description and generates three reference images, each from a different angle:

  1. Extreme close-up, shallow depth of field
  2. Medium close-up, three-quarter view, dramatic rim lighting
  3. Profile shot, high-contrast chiaroscuro lighting

Gemini 3.1 Pro evaluates all three variants against criteria: grounded hyper-realistic aesthetic, iconic presence, and micro-expression depth. It recommends one. The selected reference is saved to cast/marcus.jpg and injected into every shot prompt that features Marcus.

The terminal output during casting:

--- CASTING CALL: MARCUS ---
Archetype: The Falling Titan

OPTIONS GENERATED IN: projects/quiet-coup/casting
[0] projects/quiet-coup/casting/marcus_opt_0.jpg
[1] projects/quiet-coup/casting/marcus_opt_1.jpg
[2] projects/quiet-coup/casting/marcus_opt_2.jpg

[AUTO] Selected option [1] based on Gemini assessment.
       Saved to: projects/quiet-coup/cast/marcus.jpg
       To override: replace that file with your preferred option and re-run.

You can override. Replace the auto-selected reference with your own image, re-run, and the engine uses your choice. Casting is a suggestion, not a lock.

Prompt Refinement

The blueprint says: “Marcus sits in a glass-walled office, reading a termination report.” That is 11 words. The prompt refiner transforms it into a 200-word generation prompt:

  • Character reference anchors from casting (Marcus’s face, build, clothing)
  • Style directives from the aesthetic_dna block (Corporate Brutalism, cold office blues)
  • Continuity context from the previous shot (if this is shot 2+, what came before)
  • Profile-specific pacing and mood markers (film profile adds contemplative framing)
  • Technical parameters (aspect ratio, model tier mapping)

The creator writes 11 words. The engine constructs 200. The gap between those two numbers is where production automation lives.

Scene Linking

Scene linking extracts the last frame of each generated shot and passes it as a visual anchor to the next, creating continuity between independent AI video generations that have no shared memory.

AI video models do not remember previous generations. Each API call produces an independent clip with no knowledge of what came before. Without intervention, cutting between shots produces jarring discontinuities, colors shift, camera angles reset, environments change between consecutive scenes.

Scene linking compensates. After shot 1 generates, Telos extracts the final frame and passes it as tail_image_url to shot 2’s generation request. Combined with a prompt handshake (“continue from the previous scene”), this gives the model a visual starting point. The result is not perfect temporal coherence (that requires models with actual memory), but it produces noticeably smoother transitions than prompting each shot independently.
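The linking loop itself fits in a few lines. In this sketch, generate and extract_last_frame stand in for the engine's model call and frame extraction; tail_image_url is the parameter named above:

```python
def run_shots(shots: list, generate, extract_last_frame) -> list:
    """Chain shots by feeding each one the previous shot's final frame.

    Sketch: `generate` submits one generation request and returns a clip;
    `extract_last_frame` returns a frame reference (e.g. a URL). Both are
    stand-ins for engine internals.
    """
    clips, prev_frame = [], None
    for shot in shots:
        request = {"prompt": shot["prompt"]}
        if prev_frame is not None:
            request["tail_image_url"] = prev_frame        # visual anchor
            request["prompt"] += " Continue from the previous scene."
        clip = generate(request)
        prev_frame = extract_last_frame(clip)
        clips.append(clip)
    return clips
```

The first shot gets no anchor; every subsequent shot inherits one, which is exactly why a bad final frame propagates (see the limitations section).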

Duration-Based Cost Attribution

Each shot in the blueprint carries a duration. An 8-second shot at Kling O3 Standard costs $1.79 (8 × $0.224). The engine calculates this before calling the API. After generation, it logs the actual cost. Post-production, you get a per-shot cost breakdown showing estimate vs. actual.
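The arithmetic is one line per shot. The $0.224/second rate is the article's Kling O3 Standard figure; the rate-table key and helper name are illustrative:

```python
# Per-second rates; the Kling figure comes from the example above, the
# key name is illustrative.
RATE_PER_SECOND = {"kling-o3-standard": 0.224}

def estimate_cost(shot: dict) -> float:
    """Pre-flight estimate: duration x per-second rate, rounded to cents."""
    return round(shot["duration"] * RATE_PER_SECOND[shot["model"]], 2)
```

An 8-second shot estimates to round(8 × 0.224, 2) = 1.79, matching the figure above.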

The Software Engineering Argument

Blueprints are files. This is the most important thing about them.

| Capability | Manual Prompting | Blueprint |
| --- | --- | --- |
| Character consistency | Re-type description in every prompt | Declared once, auto-injected via casting |
| Shot continuity | Remember what happened 6 shots ago | Scene linking extracts and passes last frame |
| Cost visibility | Discover the bill after generating | telos run --plan shows estimates before spending |
| Reproducibility | Screenshot your prompt history | git clone the repo + telos run |
| Iteration | Re-prompt every shot from scratch | Edit the JSON, re-run; unchanged shots are cached |
| Collaboration | Slack screenshots of prompt chains | Pull request on the blueprint file |
| Batch production | Open 10 browser tabs | for bp in blueprints/*.json; do telos run "$bp"; done |

Version control deserves emphasis. git diff on a blueprint shows exactly what changed between version 1 and version 2 of a production. Changed Marcus’s age from 60 to 55? That is one line in the diff. Swapped the lighting from cold blues to warm amber? That is visible in the aesthetic_dna block. git blame shows who made each change and when.

Reproducibility follows from declarative specification. The same blueprint, run through the same engine version, produces comparable output. Not identical (generative models are stochastic), but structurally comparable. Same characters, same shot count, same pacing, same mood. This is versioned creative work, not ephemeral prompt chains.

Forkability is a consequence of both. Love someone’s blueprint structure? Fork it. Change the characters and setting. Keep the pacing and shot composition. Run it. You get a different production with the same structural DNA.

What Blueprints Cannot Do

Blueprints are JSON. They are not visual, not drag-and-drop, not WYSIWYG. If you cannot read a JSON file, blueprints are inaccessible. A visual blueprint editor would lower this barrier, but one does not exist yet.

The quality of a blueprint depends on the creator’s ability to describe scenes, characters, and mood. The engine executes; it does not create. A vague blueprint (“a guy walks somewhere”) produces a vague production. The engine amplifies creative direction; it does not substitute for it.

Scene linking depends on generation model quality. If Kling produces a shot where the final frame is dark or blurry, the next shot inherits that visual anchor. Garbage in, garbage out: the linking mechanism is only as good as the generated content.

Blueprint format is not standardized. Telos uses its own JSON schema. There is no industry-standard blueprint format for AI video production because the category does not exist yet. If it grows, format interoperability will matter.


Try it. Create a project, open the blueprint JSON, describe two characters and four shots, then run:

telos run --plan myproject

Zero API calls. Zero cost. The engine validates your blueprint, maps characters to casting calls, calculates shot costs, and shows you the full production plan. Edit the blueprint until it looks right, then run it for real.
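A plan pass reduces to validation plus counting, with no network calls. Field names match the example blueprint; the plan function and its specific checks are illustrative, not Telos's actual validator:

```python
def plan(blueprint: dict) -> dict:
    """Dry-run: validate required fields and summarize the work to be done.

    Illustrative sketch; the real planner also maps tiers to models and
    prices each shot.
    """
    for key in ("title", "characters", "narrative_arc"):
        if key not in blueprint:
            raise ValueError(f"blueprint missing required field: {key}")
    return {
        "casting_calls": [c["char_id"] for c in blueprint["characters"]],
        "shot_count": len(blueprint["narrative_arc"]),
        "api_calls": 0,   # plan mode never spends
    }
```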


topics
blueprint-filmmaking · declarative-video · ai-video · scene-linking · auto-casting