AI Transition: What It Is and How to Make One

An AI transition is a scene change generated by an AI video model that morphs from a start frame to an end frame. The current best models for AI transitions are Kling V3 and Veo 3.1 because both support first-frame and last-frame inputs, which is exactly the workflow a transition needs. A typical AI transition runs $0.42 to $1.20 per generation.

In one sentence

An AI transition is a video clip generated by an AI model that morphs smoothly from a start frame to an end frame, used in place of a hard cut between two scenes.

What an AI transition actually is

An AI transition is a video clip generated by an AI model that morphs smoothly from a start frame to an end frame. You feed the model two reference images (the first frame of the transition and the last frame), and the model generates the motion between them.

The result is a clip where the visual content changes during the shot in a way that would be impossible to film in real life. Same person, different outfit. Same scene, different time of day. Same character, different age. The AI generates every frame in between to make the change look continuous.

This isn't a new concept (motion graphics tools have done morph transitions for years), but the AI version is much faster and much cheaper than the manual one. A morph that used to take a few hours of After Effects work can be generated in 30 seconds for the price of a few generations.

How to actually generate one

The workflow uses the first-frame and last-frame inputs that Kling V3 and Veo 3.1 both support. Both models accept two reference images and a duration, and they generate a clip that morphs from the first image to the second over the duration you specify.

Step one: generate or pick the start frame. This can be a still you generated in Nano Banana 2, a screenshot from existing footage, or a photo you took yourself. The model just needs an image to anchor the transition's start.

Step two: generate or pick the end frame. Same options. Make sure the framing roughly matches the start frame so the transition feels continuous instead of jarring. Same camera angle, same general composition.

Step three: feed both frames to Kling or Veo with first-frame and last-frame parameters. Add a short prompt describing what should change during the transition ("outfit morphs from casual to formal," "scene transitions from day to night"). Generate.

Step four: the model returns a 5-10 second clip that morphs from your first frame to your last frame. Drop it onto your timeline as a transition between two regular shots, or use it as a standalone clip if the transition itself is the content.
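
The four steps above can be sketched as a small request builder. This is a minimal sketch, not any provider's actual API: the `TransitionRequest` class, the `build_payload` helper, and the field names `first_frame`, `last_frame`, `prompt`, and `duration` are all hypothetical, so map them to whatever parameter names your Kling or Veo client really uses. The 5 to 10 second bound mirrors the clip length mentioned in step four.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransitionRequest:
    """Inputs for one first-frame/last-frame generation (hypothetical shape)."""
    first_frame: str  # path or URL of the start image (step one)
    last_frame: str   # path or URL of the end image (step two)
    prompt: str       # short description of what changes (step three)
    duration_s: int   # requested clip length in seconds


def build_payload(req: TransitionRequest) -> dict:
    """Validate the request and return a plain dict for your provider's client.

    The key names are placeholders, not a real API schema.
    """
    if not req.first_frame or not req.last_frame:
        raise ValueError("both a first frame and a last frame are required")
    if not req.prompt.strip():
        raise ValueError("add a short prompt describing the change")
    if not 5 <= req.duration_s <= 10:
        raise ValueError("this sketch assumes clips of 5 to 10 seconds")
    return {
        "first_frame": req.first_frame,
        "last_frame": req.last_frame,
        "prompt": req.prompt.strip(),
        "duration": req.duration_s,
    }


# Usage: build the payload, then pass it to whichever client you use.
payload = build_payload(
    TransitionRequest(
        first_frame="start.png",
        last_frame="end.png",
        prompt="outfit morphs from casual to formal",
        duration_s=5,
    )
)
print(payload)
```

Validating before you submit keeps you from paying for a generation that was missing one of the two frames.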

What AI transitions are good at and where they fall down

AI transitions are great for visual changes that would be expensive to fake with motion graphics. Outfit swaps, location changes, time-of-day shifts, age progression, and object metamorphosis all work because the model can interpolate between visually similar start and end frames.

They fall down when the start frame and end frame are too different. If your start frame is a wide shot of a forest and your end frame is a close-up of a face, the model has nothing to interpolate. The result will be a chaotic morph that looks like the model is guessing (because it is). Keep the framing similar between the two frames.
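
One cheap pre-flight check for mismatched frames is to compare their aspect ratios before generating. This is only a rough sketch under stated assumptions: the function names and the 5% tolerance are made up for illustration, and matching dimensions say nothing about composition or subject placement. It can only catch an obvious mismatch, such as a landscape start frame paired with a portrait end frame.

```python
def aspect_ratio_gap(size_a: tuple[int, int], size_b: tuple[int, int]) -> float:
    """Return the relative difference between two (width, height) aspect ratios."""
    ratio_a = size_a[0] / size_a[1]
    ratio_b = size_b[0] / size_b[1]
    return abs(ratio_a - ratio_b) / max(ratio_a, ratio_b)


def framing_roughly_matches(
    size_a: tuple[int, int],
    size_b: tuple[int, int],
    tolerance: float = 0.05,  # arbitrary threshold, chosen for illustration
) -> bool:
    """True when the two frames have nearly the same aspect ratio."""
    return aspect_ratio_gap(size_a, size_b) <= tolerance


# Usage: two 16:9 frames pass, landscape versus portrait does not.
print(framing_roughly_matches((1920, 1080), (1280, 720)))
print(framing_roughly_matches((1920, 1080), (1080, 1920)))
```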

They also fall down on transitions that need precise timing. AI video models don't give you frame-level control over when specific events happen during the clip. So if you need an outfit change to happen at exactly 1.5 seconds into a 3-second clip, the AI version will feel imprecise compared to a hand-keyed motion graphics morph.

What AI transitions cost

Per-clip pricing is the same as any other AI video generation. Kling V3 Standard at $0.084 per second runs $0.42 for a 5-second transition or $0.84 for a 10-second one. Veo 3.1 Fast at $0.10 per second is roughly the same rate: $0.50 for 5 seconds or $1.00 for 10.

So a typical AI transition costs about $0.50 to $1.00 per generation. You usually run 2-3 generations to pick the best version, which puts the all-in cost per finished transition at $1.00 to $3.00.
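
The per-clip arithmetic can be sketched in a few lines. The per-second rates are the ones quoted above; the function names are illustrative, and `Decimal` is used instead of floats so the cents come out exact.

```python
from decimal import Decimal

# Per-second rates quoted in the text (USD).
RATES = {
    "Kling V3 Standard": Decimal("0.084"),
    "Veo 3.1 Fast": Decimal("0.10"),
}
CENT = Decimal("0.01")


def clip_cost(rate_per_second: Decimal, seconds: int) -> Decimal:
    """Cost of a single generation, rounded to cents."""
    return (rate_per_second * seconds).quantize(CENT)


def finished_cost(rate_per_second: Decimal, seconds: int, attempts: int) -> Decimal:
    """All-in cost when you generate `attempts` versions and keep one."""
    return (rate_per_second * seconds * attempts).quantize(CENT)


for name, rate in RATES.items():
    for seconds in (5, 10):
        print(
            f"{name}, {seconds}s: ${clip_cost(rate, seconds)} per generation, "
            f"${finished_cost(rate, seconds, 3)} after 3 attempts"
        )
```

Change the attempt count to match how many versions you usually generate before picking one.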

Compare that to motion graphics work that runs $200 to $1,000 per minute of finished output, and the math is friendly. AI transitions don't replace serious motion graphics work for client projects with specific requirements, but they replace 80% of the routine morph and transition work that used to eat hours per project.
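
To put both figures on the same per-minute basis, here is a rough sketch. It rests on assumptions that are not in the pricing above: that a finished minute is assembled from six 10-second Kling V3 Standard clips, and that each clip takes three attempts. It counts raw API cost only, not editing time.

```python
from decimal import Decimal

KLING_RATE = Decimal("0.084")  # USD per second, from the text
CLIP_SECONDS = 10              # assumption: build the minute from 10-second clips
ATTEMPTS_PER_CLIP = 3          # assumption: upper end of the 2-3 attempts mentioned
MOTION_GRAPHICS_PER_MINUTE = (Decimal("200"), Decimal("1000"))  # from the text


def ai_cost_per_finished_minute(rate: Decimal, clip_seconds: int, attempts: int) -> Decimal:
    """Raw API cost to fill 60 seconds, assuming clip_seconds divides 60 evenly."""
    clips_needed = 60 // clip_seconds
    return rate * clip_seconds * attempts * clips_needed


ai_minute = ai_cost_per_finished_minute(KLING_RATE, CLIP_SECONDS, ATTEMPTS_PER_CLIP)
low, high = MOTION_GRAPHICS_PER_MINUTE
tenth = Decimal("0.1")

print(f"AI, per finished minute: ${ai_minute}")
print(
    f"Motion graphics is {(low / ai_minute).quantize(tenth)}x "
    f"to {(high / ai_minute).quantize(tenth)}x that figure"
)
```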

Frequently asked questions

What is an AI transition?

An AI transition is a video clip generated by an AI model that morphs smoothly from a start frame to an end frame. You feed the model two reference images and a short prompt, and the model generates the motion between them. It is used in place of a hard cut when you want a visual morph (outfit change, location shift, time of day) instead of a clean edit.

Which AI models support transitions?

Kling V3 and Veo 3.1 both support first-frame and last-frame inputs, which is the workflow an AI transition needs. Kling is cheaper at $0.084 per second with your own fal.ai API key. Veo Fast starts at $0.10 per second and goes higher for 4K and audio. Seedance 2.0 also supports first/last frame mode but its strict face filters limit it for character transitions.

How much does an AI transition cost to generate?

About $0.50 to $1.00 per generation in raw API cost on Kling V3 Standard, depending on the clip length. With 2-3 iterations to pick the best version, the all-in cost per finished transition runs $1.00 to $3.00. Compare that to motion graphics work at $200 to $1,000 per minute, and the AI version replaces most routine morph work.

Can I make an AI transition from a real photo?

Yes. The first-frame and last-frame inputs both accept any image, including real photos you took yourself. So you can transition from a real photo of yourself to an AI-generated end frame, or from one real photo to another with AI interpolation in between. The model treats both inputs the same regardless of source.

Why do my AI transitions look chaotic in the middle?

Usually because the start frame and end frame are too visually different for the model to interpolate cleanly. If your start frame is a wide shot and your end frame is a close-up, the model has no anchor and the middle of the clip looks like guessing. Keep the framing, composition, and subject placement roughly consistent between the two frames.

Generate AI transitions in a real workflow

Slates ships first/last frame support for Kling and Veo with reference image management built into the storyboard system. Generate transitions in seconds, drop them on the timeline, and export to a finished video.
