AI video tools generate clips. A clip is not a film.
Making a multi-shot video with current tools looks like this: prompt shot 1 in Runway, download. Prompt shot 2; the character looks different, because the model has no memory. Download. Prompt shot 3; realize the lighting shifted. Re-prompt. Download. Open a video editor. Drag clips onto a timeline. Add transitions. Find background audio. Sync it. Export. Repeat for every production.
Telos Engine runs a six-stage production pipeline (cast, refine, shoot, compose, review, deliver), producing a finished multi-shot video from a single CLI command.
The problem was never generation quality. Runway, Pika, and Kling produce good clips. The problem is production orchestration: the 90% of the work that happens around generation.
Six Stages, One Pipeline
Physical filmmaking follows a predictable sequence: cast actors, develop the script, shoot footage, edit it together, review the cut, deliver the final print. Telos follows the same sequence, automated.
Here is what happens when you run telos run on a film blueprint:
Stage 1: Cast
Before generating a single frame, Telos creates reference images for each character in the blueprint. The casting director generates three reference variants per character: an extreme close-up, a three-quarter view with rim lighting, and a profile shot with chiaroscuro lighting. Gemini 3.1 Pro evaluates all three for consistency, presence, and aesthetic fit, then selects the strongest reference.
Every shot that features that character inherits this reference image. When the visual engine generates shot 7 and shot 14, both featuring the same character, they anchor to the same casting reference. This is how consistency works across shots without the video model needing memory.
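The anchoring described above can be sketched as a simple lookup. The class and field names below are assumptions for illustration, not Telos's actual internals:

```python
from dataclasses import dataclass, field

@dataclass
class CastingReference:
    # Winning reference image selected for one character during casting
    char_id: str
    image_path: str

@dataclass
class Shot:
    shot_id: int
    characters: list
    reference_images: list = field(default_factory=list)

def anchor_shots(shots, casting):
    """Attach each character's casting reference to every shot featuring them."""
    refs = {c.char_id: c.image_path for c in casting}
    for shot in shots:
        shot.reference_images = [refs[c] for c in shot.characters if c in refs]
    return shots
```

Because shot 7 and shot 14 resolve to the same `image_path`, the video model receives an identical visual anchor both times, which is what stands in for memory.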
Stage 2: Refine
Raw shot descriptions from the blueprint are sparse: “Marcus sits in a glass-walled office, reading a termination report.” The prompt refiner enriches each shot with scene context, character anchors from casting, style directives from the production profile, and continuity markers from the previous shot. A 15-word description becomes a 200-word generation prompt calibrated for the visual model.
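In outline, refinement is concatenation with structure. This is a hypothetical sketch of that step, not the refiner's real prompt template:

```python
def refine_prompt(shot_desc, scene_context, character_anchors, style, continuity=""):
    """Enrich a sparse blueprint description with scene context, casting
    anchors, style directives, and a continuity marker from the prior shot."""
    parts = [scene_context, shot_desc, *character_anchors, style, continuity]
    return " ".join(p.strip() for p in parts if p.strip())

prompt = refine_prompt(
    shot_desc="Marcus sits in a glass-walled office, reading a termination report.",
    scene_context="Corporate tower at dusk, floor-to-ceiling glass.",
    character_anchors=["Marcus: a man in his 60s, silver hair, bespoke navy suit."],
    style="Art-house drama, muted palette, shallow depth of field.",
    continuity="Matches the cool key light of the previous shot.",
)
```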
Stage 3: Shoot
Image and video generation via FAL. Each shot gets a keyframe image (Flux 2 Pro), then a video clip (Kling O3, 5-10 seconds per shot). Scene linking extracts the last frame of each completed shot and passes it as a reference to the next, creating visual flow between independent generations.
If a shot fails (API error, content filter, timeout), the engine does not abort. It logs the failure, adjusts the prompt, and retries. If the retry fails, production continues with the remaining shots. A 17-shot production delivers 16 shots instead of crashing at shot 8.
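The failure-tolerant loop is the important design decision here. A minimal sketch of retry-then-continue behavior (the `generate` and `adjust` callables are stand-ins, not Telos's API):

```python
def shoot_all(shots, generate, adjust, log=print):
    """Generate every shot; retry a failed shot once with an adjusted
    prompt, then continue with the rest rather than aborting the run."""
    completed, failed = [], []
    for shot in shots:
        try:
            completed.append(generate(shot))
            continue
        except Exception as err:
            log(f"shot {shot} failed ({err}); retrying with adjusted prompt")
        try:
            completed.append(generate(adjust(shot)))
        except Exception as err:
            log(f"shot {shot} retry failed ({err}); continuing without it")
            failed.append(shot)
    return completed, failed
```

With this shape, one permanently failing shot out of 17 yields 16 completed clips and a logged gap, matching the behavior described above.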
Stage 4: Compose
FFmpeg assembles the shot clips into a continuous video. The film profile applies slow crossfade transitions (0.5 seconds), layers native audio from Kling's generation, adds a 4-second fade to black, and inserts sequence markers between acts. Other profiles handle composition differently: the ads profile adds a CTA end card, and the social profile applies word-by-word caption overlays.
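One way such crossfades could be expressed is with FFmpeg's `xfade` filter, where each transition's offset is the running duration of the merged stream minus the overlap. A sketch of building that filtergraph (an assumption about the mechanism, not Telos's actual filter construction):

```python
def crossfade_filtergraph(durations, fade=0.5):
    """Build an ffmpeg xfade filter chain for consecutive clips with a
    fixed crossfade length (durations are per-clip seconds)."""
    parts, prev, offset = [], "[0:v]", 0.0
    for i, dur in enumerate(durations[:-1]):
        offset += dur - fade  # next transition starts `fade` seconds early
        out = f"[v{i + 1}]"
        parts.append(
            f"{prev}[{i + 1}:v]xfade=transition=fade:duration={fade}:offset={offset:.2f}{out}"
        )
        prev = out
    return ";".join(parts)
```

For three 5-second clips this produces two `xfade` stages at offsets 4.50 and 9.00, which would be passed to `ffmpeg -filter_complex`.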
Stage 5: Review
The production reviewer (Gemini 3 Flash) scores the assembled output against quality criteria: visual consistency, narrative coherence, pacing, and technical quality. When running with --variants 3, the reviewer auto-selects the best variant per shot.
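The variant auto-selection reduces to an argmax over reviewer scores. A sketch, assuming a score shape of shot -> variant -> per-criterion values (the shape is an illustration, not the reviewer's real output format):

```python
def select_best_variants(scores):
    """Pick the highest-scoring variant per shot by summed criterion scores."""
    return {
        shot_id: max(variants, key=lambda v: sum(variants[v].values()))
        for shot_id, variants in scores.items()
    }
```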
Stage 6: Deliver
Final export to the output directory with metadata: cost breakdown per shot, quality scores, production log, and the blueprint that generated it. Every production is traceable and reproducible.
Five Commands, Full Production
pipx install telos-engine
telos doctor
telos new alchemist --profile film
telos run --plan alchemist
telos run alchemist
telos doctor checks nine dependencies: Python version (3.10+), FFmpeg, FFprobe, audio codecs, API keys (Gemini, FAL), license status, disk space (5 GB minimum), and required Python packages. Every check shows OK or FAIL with a fix instruction.
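In spirit, a doctor-style check is a table of named predicates printed as OK or FAIL. A minimal sketch covering three of the nine checks (this is not Telos's implementation):

```python
import shutil
import sys

def run_checks():
    """Print OK/FAIL for a subset of environment checks; return overall status."""
    checks = {
        "Python 3.10+": sys.version_info >= (3, 10),
        "FFmpeg": shutil.which("ffmpeg") is not None,
        "FFprobe": shutil.which("ffprobe") is not None,
    }
    for name, ok in checks.items():
        print(f"{name}: {'OK' if ok else 'FAIL'}")
    return all(checks.values())
```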
telos new creates a project directory with the selected profile configuration. The --profile flag accepts six production types: film, clip, ads, social, explainer, documentary. Each profile tunes the entire pipeline: shot count, pacing, transitions, audio treatment, and review criteria.
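Conceptually, a profile is a bundle of pipeline settings. The table below is illustrative only; the field names and values are assumptions, not Telos's schema, and only three of the six profiles are shown:

```python
# Illustrative profile definitions based on the behaviors described in this post.
PROFILES = {
    "film": {
        "transition": ("crossfade", 0.5),  # slow crossfades between shots
        "end_treatment": "fade_to_black",
        "captions": None,
    },
    "ads": {
        "transition": ("cut", 0.0),
        "end_treatment": "cta_end_card",   # CTA card appended by the ads profile
        "captions": None,
    },
    "social": {
        "transition": ("cut", 0.0),
        "end_treatment": None,
        "captions": "word_by_word",        # caption overlays per the social profile
    },
}
```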
telos run --plan previews the pipeline without executing. It shows every step, its status (done or pending), and the estimated cost:
Pipeline Plan
┌─────┬─────────────────────┬─────────┬───────────┐
│  #  │ Step                │ Status  │ Est. Cost │
├─────┼─────────────────────┼─────────┼───────────┤
│  1  │ Film Brief          │ Pending │     $0.01 │
│  2  │ Casting             │ Pending │     $0.30 │
│  3  │ Shot Planning       │ Pending │     $0.01 │
│  4  │ Visual Generation   │ Pending │    $17.10 │
│  5  │ Assembly            │ Pending │        -- │
│  6  │ Final Export        │ Pending │        -- │
└─────┴─────────────────────┴─────────┴───────────┘
Summary
──────────────────────────────────────────────
Profile: Film Maker
Steps: 7
Est. shots: 15
Est. cost: $17.42
Image: $0.045/shot | Video: $1.12/shot (5s @ $0.224/s)
──────────────────────────────────────────────
No credits. No obfuscation. Per-shot cost attribution with the model and rate visible.
telos run executes the full pipeline. Progress prints to the terminal:
Running pipeline for: alchemist
Profile: Film Maker
[1/7] Film Brief
Brief created with 2 characters
[2/7] Casting
marcus cast successfully.
elara cast successfully.
[3/7] Shot Planning
Shot list created: 17 shots
[4/7] Visual Generation
Generating shot 0/16...
Shot 0 complete.
Generating shot 1/16...
...
[5/7] Assembly
Silent assembly complete.
Final video with audio.
[6/7] Final Export
Final export: projects/alchemist/output/alchemist_final.mp4
The pipeline is resume-safe. Every step checks its completion status before executing. If generation fails at shot 12, re-running telos run skips shots 0-11 and picks up at 12. No wasted API spend.
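The resume logic can be sketched as a loop over steps guarded by persisted completion state (the in-memory set here stands in for whatever Telos actually persists to disk):

```python
def run_pipeline(steps, completed):
    """Resume-safe loop: skip any step already recorded as done, so a
    re-run after a failure picks up exactly where the last run stopped."""
    for step_id, action in steps:
        if step_id in completed:
            continue            # no repeated API spend for finished steps
        action()
        completed.add(step_id)  # in a real tool this would be written to disk
    return completed

calls = []
steps = [(i, lambda i=i: calls.append(i)) for i in range(5)]
state = run_pipeline(steps, completed={0, 1, 2})  # pretend 0-2 finished earlier
```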
After completion, a summary panel:
Pipeline Complete
────────────────
Steps: 7
Shots: 17
Errors: 0
Time: 847.3s
Output: projects/alchemist/output/alchemist_final.mp4
Est. Cost: $35.12
What $35 Buys
A 17-shot, 150-second art-house film costs approximately $35 in API fees: $0.77 for image generation (Flux 2 Pro at $0.045/image), $33.60 for video generation (Kling O3 Standard at $0.224/second), and $0.50 for creative direction (Gemini).
The cost breakdown:
| Component | Model | Rate | Units | Cost |
|---|---|---|---|---|
| Keyframe images | Flux 2 Pro | $0.045/image | 17 images | $0.77 |
| Video generation | Kling O3 Standard | $0.224/second | 150 seconds | $33.60 |
| Creative direction | Gemini 3.1 Pro | ~$0.50/production | 1 production | $0.50 |
| Composition | FFmpeg | $0 (local) | -- | $0.00 |
| Total | | | | $34.87 |
Video generation is 96% of the cost. Image generation is negligible. Creative direction (Gemini for casting assessment and prompt refinement) rounds to $0.50.
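The arithmetic above can be reproduced in a few lines, using the rates quoted in the table:

```python
def estimate_cost(shots, seconds,
                  image_rate=0.045, video_rate=0.224, direction=0.50):
    """Keyframe images plus video seconds plus a flat creative-direction
    estimate, using the per-unit rates from the cost table."""
    return shots * image_rate + seconds * video_rate + direction

total = estimate_cost(shots=17, seconds=150)  # the 150-second film above
```

Changing `video_rate` is the lever that matters: at 96% of total cost, the per-second video rate dominates every other line item.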
For comparison, other production types:
| Production | Profile | Duration | Est. Cost |
|---|---|---|---|
| Art-house short film | film | 150s | ~$35 |
| Viral content clip | clip | 75s | ~$18 |
| Product advertisement | ads | 30s | ~$7.40 |
| Explainer video | explainer | 90s | ~$21 |
These are API costs: what you pay Flux and Kling directly through your own API keys (BYOK architecture). Telos does not mark up API calls or take a percentage.
What This Changes
A blueprint is a JSON file. It describes characters, shots, mood, pacing, and visual style. It does not contain API calls, model parameters, or prompt engineering.
{
  "title": "THE QUIET COUP",
  "target_duration": 150,
  "characters": [
    {
      "char_id": "marcus",
      "archetype": "The Falling Titan",
      "consistency_prompt": "A man in his 60s, silver hair, bespoke navy suit..."
    }
  ],
  "narrative_arc": [
    {
      "scene": "The Audit",
      "description": "Marcus sits in a glass-walled office, reading a report..."
    }
  ]
}
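Loading and sanity-checking a blueprint like the one above could look like this. A minimal sketch: the required-field set matches only the keys shown in the example, and the real schema is presumably richer:

```python
import json

REQUIRED_FIELDS = {"title", "target_duration", "characters", "narrative_arc"}

def load_blueprint(text):
    """Parse a blueprint and verify the top-level fields are present."""
    data = json.loads(text)
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"blueprint missing fields: {sorted(missing)}")
    return data
```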
Because blueprints are files:
- Version control. `git diff` a blueprint. See what changed between v1 and v2 of a production.
- Reproducibility. Same blueprint, same pipeline, comparable output (modulo model variance).
- Batch production. Run 10 blueprints overnight. Review in the morning.
- Iteration. Change a character description, re-run. Swap from the `film` to the `clip` profile, re-run. Adjust shot pacing, re-run. The pipeline is the constant; the creative input is the variable.
What Does Not Work Yet
No real-time preview. Telos is a batch production tool, not an interactive editor. You write the blueprint, run the pipeline, and review the output. Runway’s real-time preview is better for exploration. Telos is built for production.
Character consistency is good, not perfect. Single-character shots work well: the casting reference anchors the model effectively. Ensemble casts (3+ characters in one shot) are harder; the model sometimes drifts from references when compositing multiple characters simultaneously.
Audio is basic. Kling generates native audio per shot, and FFmpeg assembles it. There is no dialogue synthesis, no voice acting, no scored music generation. Background ambience and environmental sound only.
No GUI. This is a terminal application. If you are not comfortable with a CLI, Telos is not for you. That is intentional: scriptability and automation require a CLI-first architecture.
Output quality depends on the generation models. Kling O3 Standard produces good results. Kling O3 Pro produces better results at 25% higher cost. When better models ship, Telos can plug them in; the pipeline is model-agnostic.
Install and run the environment check:
pipx install telos-engine
telos doctor
Nine checks, one command. If everything passes, try telos new my-first-clip --profile clip and write your first blueprint.