Making a 12-shot AI video today requires writing at least 24 prompts: 12 for image generation and 12 for video generation. Each prompt repeats the character description because the model has no memory. Each shot requires manually tracking what happened in previous shots to preserve continuity. You manage reference images by copying URLs between generation calls. You assemble the result in a separate video editor.
This is imperative video production: you tell the system what to do, step by step, for every shot. It does not scale. It is not reproducible. It is not version-controlled.
Blueprint filmmaking replaces per-clip prompting with declarative production specifications: JSON files that describe characters, shots, mood, and pacing, and that execute as a single pipeline.
## What a Blueprint Looks Like
A blueprint describes what the production is, not how to produce it. Here is a simplified example:
```json
{
  "title": "THE QUIET COUP",
  "target_duration": 150,
  "characters": [
    {
      "char_id": "marcus",
      "archetype": "The Falling Titan",
      "consistency_prompt": "A man in his 60s, silver hair, bespoke navy suit, eyes showing a mix of exhaustion and terror.",
      "iconic_visual_markers": "A slight tremor in his hand as he holds a crystal glass."
    },
    {
      "char_id": "elara",
      "archetype": "The Optimized Successor",
      "consistency_prompt": "Female, 30s, sharp ethereal features, eyes like polished glass.",
      "iconic_visual_markers": "Zero visible pores, a predatory but calm gaze."
    }
  ],
  "aesthetic_dna": {
    "visual_style": "Corporate Brutalism, cold steel and glass",
    "lighting_key": "Cold office blues vs. warm, nervous sweat on skin"
  },
  "narrative_arc": [
    {
      "scene": "The Audit",
      "description": "Marcus sits in a glass-walled office, reading a termination report.",
      "model_tier": "character"
    },
    {
      "scene": "The Boardroom",
      "description": "Elara stands at the head of the table, perfectly still.",
      "model_tier": "character"
    },
    {
      "scene": "The Server Room",
      "description": "Marcus descends into the server room, blue LED light.",
      "model_tier": "action"
    },
    {
      "scene": "The Resignation",
      "description": "Marcus places his keycard on the desk and walks away.",
      "model_tier": "epic"
    }
  ]
}
```
Notice what the blueprint does not contain: API endpoints, model names, prompt engineering tricks, reference image URLs, FFmpeg commands, or transition timing. Those decisions belong to the engine, not the creator. The blueprint captures creative intent. The pipeline handles execution.
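Because the blueprint is just structured data, the engine can validate it before spending anything. Here is a minimal loader sketch in Python; this is illustrative, not the actual Telos implementation, though the field names mirror the example blueprint above.

```python
import json
from dataclasses import dataclass

@dataclass
class Character:
    char_id: str
    archetype: str
    consistency_prompt: str
    iconic_visual_markers: str

VALID_TIERS = {"character", "action", "epic"}

def load_blueprint(text: str) -> dict:
    """Parse a blueprint and reject scenes with unknown model tiers
    before any generation call is made."""
    bp = json.loads(text)
    for scene in bp["narrative_arc"]:
        if scene["model_tier"] not in VALID_TIERS:
            raise ValueError(f"unknown model_tier: {scene['model_tier']}")
    bp["characters"] = [Character(**c) for c in bp["characters"]]
    return bp
```

Failing fast on an invalid tier is cheap here; failing after ten shots have already been billed is not.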
The `model_tier` field ("character", "action", or "epic") tells the engine what visual fidelity each scene needs. Character scenes prioritize face consistency. Action scenes prioritize motion quality. Epic scenes use the highest-tier generation model. The engine maps these tiers to specific FAL models automatically.
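That mapping can be pictured as a simple lookup table. The endpoint names below are placeholders for illustration, not real FAL model identifiers:

```python
# Illustrative tier-to-model mapping; these endpoint IDs are
# placeholders, not actual FAL model names.
TIER_MODELS = {
    "character": "fal/character-consistency-model",  # prioritizes face fidelity
    "action":    "fal/motion-quality-model",         # prioritizes movement
    "epic":      "fal/highest-fidelity-model",       # top-tier generation
}

def resolve_model(scene: dict) -> str:
    """Map a scene's declared tier to a concrete generation endpoint."""
    return TIER_MODELS[scene["model_tier"]]
```

The creator declares intent ("this scene needs face consistency"); the engine owns the volatile detail of which model currently delivers it.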
## What the Engine Does With a Blueprint
When you run `telos run`, the engine reads this JSON and executes a multi-stage pipeline. Each stage transforms the blueprint’s declarative intent into concrete production actions.
### Character Resolution via Auto-Casting
Marcus is described as “a man in his 60s, silver hair, bespoke navy suit.” The casting director takes this description and generates three reference images, each from a different angle:
- Extreme close-up, shallow depth of field
- Medium close-up, three-quarter view, dramatic rim lighting
- Profile shot, high-contrast chiaroscuro lighting
Gemini 3.1 Pro evaluates all three variants against criteria: grounded hyper-realistic aesthetic, iconic presence, and micro-expression depth. It recommends one. The selected reference is saved to `cast/marcus.jpg` and injected into every shot prompt that features Marcus.
The terminal output during casting:
```
--- CASTING CALL: MARCUS ---
Archetype: The Falling Titan
OPTIONS GENERATED IN: projects/quiet-coup/casting
[0] projects/quiet-coup/casting/marcus_opt_0.jpg
[1] projects/quiet-coup/casting/marcus_opt_1.jpg
[2] projects/quiet-coup/casting/marcus_opt_2.jpg
[AUTO] Selected option [1] based on Gemini assessment.
Saved to: projects/quiet-coup/cast/marcus.jpg
To override: replace that file with your preferred option and re-run.
```
You can override. Replace the auto-selected reference with your own image, re-run, and the engine uses your choice. Casting is a suggestion, not a lock.
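The select-unless-overridden behavior could look like the following sketch. Everything here (`auto_cast`, its signature, the scoring callable) is a hypothetical illustration, not the Telos API:

```python
from pathlib import Path
from typing import Callable

def auto_cast(char_id: str, options: list[Path],
              score: Callable[[Path], float],
              cast_dir: Path) -> Path:
    """Pick the highest-scoring reference image, unless the creator
    has already placed an override at cast/<char_id>.jpg."""
    target = cast_dir / f"{char_id}.jpg"
    if target.exists():  # manual override wins; casting is a suggestion
        return target
    best = max(options, key=score)  # e.g. an LLM-based aesthetic score
    target.write_bytes(best.read_bytes())
    return target
```

The key design point is the order of checks: the override file is consulted before any scoring happens, so a re-run never clobbers a human choice.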
### Prompt Refinement
The blueprint says: “Marcus sits in a glass-walled office, reading a termination report.” That is 11 words. The prompt refiner transforms it into a 200-word generation prompt:
- Character reference anchors from casting (Marcus’s face, build, clothing)
- Style directives from the `aesthetic_dna` block (Corporate Brutalism, cold office blues)
- Continuity context from the previous shot (if this is shot 2+, what came before)
- Profile-specific pacing and mood markers (film profile adds contemplative framing)
- Technical parameters (aspect ratio, model tier mapping)
The creator writes 11 words. The engine constructs 200. The gap between those two numbers is where production automation lives.
### Scene Linking
Scene linking extracts the last frame of each generated shot and passes it as a visual anchor to the next, creating continuity between independent AI video generations that have no shared memory.
AI video models do not remember previous generations. Each API call produces an independent clip with no knowledge of what came before. Without intervention, cutting between shots produces jarring discontinuities: colors shift, camera angles reset, and environments change between consecutive scenes.
Scene linking compensates. After shot 1 generates, Telos extracts the final frame and passes it as `tail_image_url` to shot 2’s generation request. Combined with a prompt handshake (“continue from the previous scene”), this gives the model a visual starting point. The result is not perfect temporal coherence (that requires models with actual memory), but it produces noticeably smoother transitions than prompting each shot independently.
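The linking step amounts to threading each shot's final frame into the next shot's request. A minimal sketch, assuming requests are plain dicts and frame extraction (e.g. via ffmpeg) has already produced the frame URLs:

```python
def link_shots(shot_requests: list[dict], last_frames: list[str]) -> list[dict]:
    """Attach each shot's final frame as the next shot's visual anchor.
    A sketch of the linking stage; field names follow the article's
    tail_image_url convention."""
    linked = []
    for i, req in enumerate(shot_requests):
        req = dict(req)  # copy: don't mutate the caller's requests
        if i > 0:
            req["tail_image_url"] = last_frames[i - 1]
            req["prompt"] += " Continue from the previous scene."
        linked.append(req)
    return linked
```

Shot 1 has no anchor by construction; every later shot carries both the image handoff and the prompt handshake.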
### Duration-Based Cost Attribution
Each shot in the blueprint carries a duration. An 8-second shot at Kling O3 Standard costs $1.79 (8 × $0.224/second). The engine calculates this before calling the API. After generation, it logs the actual cost. When the run finishes, you get a per-shot cost breakdown showing estimate vs. actual.
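The arithmetic is straightforward. A sketch using the per-second rate quoted above (the rate-table key is a made-up identifier):

```python
# Per-second rates; "kling_o3_standard" is an illustrative key,
# and $0.224/s is the rate quoted in the article.
RATE_PER_SECOND = {"kling_o3_standard": 0.224}

def estimate_cost(shots: list[dict]) -> float:
    """Pre-flight cost estimate: duration (seconds) x per-second rate,
    summed over all shots and rounded to cents."""
    return round(sum(s["duration"] * RATE_PER_SECOND[s["model"]]
                     for s in shots), 2)
```

Because the estimate needs nothing but the blueprint, it can run before a single API call is made.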
## The Software Engineering Argument
Blueprints are files. This is the most important thing about them.
| Capability | Manual Prompting | Blueprint |
|---|---|---|
| Character consistency | Re-type description in every prompt | Declared once, auto-injected via casting |
| Shot continuity | Remember what happened 6 shots ago | Scene linking extracts and passes last frame |
| Cost visibility | Discover the bill after generating | `telos run --plan` shows estimates before spending |
| Reproducibility | Screenshot your prompt history | `git clone` the repo + `telos run` |
| Iteration | Re-prompt every shot from scratch | Edit the JSON, re-run; unchanged shots are cached |
| Collaboration | Slack screenshots of prompt chains | Pull request on the blueprint file |
| Batch production | Open 10 browser tabs | `for bp in blueprints/*.json; do telos run "$bp"; done` |
Version control deserves emphasis. `git diff` on a blueprint shows exactly what changed between version 1 and version 2 of a production. Changed Marcus’s age from 60 to 55? That is one line in the diff. Swapped the lighting from cold blues to warm amber? That is visible in the `aesthetic_dna` block. `git blame` shows who made each change and when.
Reproducibility follows from declarative specification. The same blueprint, run through the same engine version, produces comparable output. Not identical (generative models are stochastic), but structurally comparable: same characters, same shot count, same pacing, same mood. This is versioned creative work, not an ephemeral prompt chain.
Forkability is a consequence of both. Love someone’s blueprint structure? Fork it. Change the characters and setting. Keep the pacing and shot composition. Run it. You get a different production with the same structural DNA.
## What Blueprints Cannot Do
Blueprints are JSON. They are not visual, not drag-and-drop, not WYSIWYG. If you cannot read a JSON file, blueprints are inaccessible. A visual blueprint editor would lower this barrier, but one does not exist yet.
The quality of a blueprint depends on the creator’s ability to describe scenes, characters, and mood. The engine executes; it does not create. A vague blueprint (“a guy walks somewhere”) produces a vague production. The engine amplifies creative direction; it does not substitute for it.
Scene linking depends on generation model quality. If Kling produces a shot whose final frame is dark or blurry, the next shot inherits that visual anchor. Garbage in, garbage out: the linking mechanism is only as good as the generated content.
The blueprint format is not standardized. Telos uses its own JSON schema. There is no industry-standard blueprint format for AI video production because the category does not exist yet. If it grows, format interoperability will matter.
Try it. Create a project, open the blueprint JSON, describe two characters and four shots, then run:
```
telos run --plan myproject
```
Zero API calls. Zero cost. The engine validates your blueprint, maps characters to casting calls, calculates shot costs, and shows you the full production plan. Edit the blueprint until it looks right, then run it for real.