Image-to-video work is where Gemini Omni becomes more controllable. Instead of asking the model to invent every visual detail, you give it a reference: a product photo, character sheet, dashboard screenshot, storyboard, or style frame.
This guide shows how to prepare those references, write prompts that say what to preserve, and review the result. For general prompt structure, read Gemini Omni prompts. For spreadsheet and chart videos, use Gemini Omni for Excel after this.
Note: The official prompt guide says Gemini Omni can use references, including images, video, text, and audio. This article focuses on still images and storyboards because they are the easiest starting point for most creators.
Note: Do not use reference images you do not have rights to use, especially for commercial work or likeness-sensitive content.
Quick answer
For image-to-video, upload one clean reference, state exactly what should be preserved, describe one motion path, choose aspect ratio and duration, then review whether the generated video kept the reference consistent.
This approach fits when:
- You have a product photo, dashboard screenshot, storyboard, or character reference.
- You need more consistency than text-only prompting gives you.
- You want to turn still assets into editable AI video clips.
Choose The Right Reference Image
A good reference image is clean, high contrast, and unambiguous. The model should not have to guess which object matters. If the image contains five products, three people, and tiny text, the output will be harder to control.
For product videos, use front, side, and detail shots only when each shot has a clear purpose. For characters, use a character sheet. For dashboards, use a clean screenshot with the exact chart or KPI that matters.
| Reference | Best preparation | Common failure |
|---|---|---|
| Product photo | Centred product, clean background, readable label. | Extra props become part of the product story. |
| Character image | Consistent outfit, front-facing or clear pose. | Identity drifts during motion. |
| Dashboard screenshot | Large chart labels and a simple layout. | Numbers or tiny text become unreadable. |
| Storyboard grid | Numbered panels or clear left-to-right order. | Scene order becomes ambiguous. |
Reference Prompt Pattern
The prompt must tell Gemini Omni which part of the reference matters. Otherwise a product reference might preserve the table but change the bottle, or a dashboard reference might preserve the colour palette but distort the numbers.
Use this order: preserve, animate, camera, style, constraints.
Use the uploaded image as the product reference. Preserve the bottle shape, cap, label colours, and front-facing logo. Create an 8-second vertical video where the bottle rotates slowly once on the same wooden desk. Camera locked-off, soft morning light, clean product demo style. Do not add new text or extra props.
Storyboards For 10-Second Clips
Storyboards are useful when you need a clip to follow a sequence. The official prompt guide mentions sharing a visual storyboard and asking Gemini Omni to follow the story in order. The practical trick is to keep panels few and readable.
For a 10-second clip, use three or four panels. More panels may work, but they leave the model less time to resolve each beat cleanly.
- Panel 1: opening hook or establishing frame.
- Panel 2: main action or transformation.
- Panel 3: result, proof, or reveal.
- Panel 4: optional final frame for captions or a call to action added later in your editor.
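To see why fewer panels help, it can be useful to work out how much screen time each panel gets. The sketch below assumes the clip is divided evenly between panels, which is a simplification. The function names `seconds_per_panel` and `panel_schedule` are illustrative and not part of any Gemini Omni API.

```python
def seconds_per_panel(clip_seconds: float, panel_count: int) -> float:
    """Return the screen time each panel gets if the clip is split evenly."""
    if panel_count < 1:
        raise ValueError("panel_count must be at least 1")
    return clip_seconds / panel_count


def panel_schedule(clip_seconds: float, panel_count: int) -> list[tuple[int, float, float]]:
    """Return (panel_number, start_second, end_second) for each panel."""
    beat = seconds_per_panel(clip_seconds, panel_count)
    return [
        (i + 1, round(i * beat, 2), round((i + 1) * beat, 2))
        for i in range(panel_count)
    ]


if __name__ == "__main__":
    # Compare how much time each beat gets as the panel count grows.
    for panels in (3, 4, 6):
        print(panels, "panels:", round(seconds_per_panel(10, panels), 2), "s each")
```

With four panels each beat gets 2.5 seconds of a 10-second clip. With six panels that drops below 2 seconds, which leaves little room for a motion to start and settle.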
Keeping Scenes Consistent
Consistency is not only about the subject. The environment, lighting, camera angle, and time of day also have to hold together. If you need multiple clips from the same world, create a small reference pack rather than changing the reference each time.
For YouTube Shorts, consistency also means the viewer can understand the story quickly. Pair this guide with Gemini Omni for YouTube Shorts if the output will be vertical short-form content.
- Use the same hero reference for every scene that needs the same product or character.
- Repeat the same camera and lighting language across prompts.
- Keep one style phrase fixed across the sequence.
- Review frame grabs from every clip side by side before editing.
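One way to keep the fixed elements fixed is to store them once and reuse them for every scene prompt. The sketch below is an assumption about how you might organise this on your side: `ReferencePack` and `scene_prompt` are hypothetical names, and the second action is an invented example. Only the action text changes from scene to scene.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferencePack:
    """Fixed elements reused across every clip in one sequence (hypothetical structure)."""
    hero_reference: str  # file name of the single hero image
    camera: str          # camera language repeated in each prompt
    lighting: str        # lighting language repeated in each prompt
    style: str           # one style phrase kept fixed across the sequence


def scene_prompt(pack: ReferencePack, action: str) -> str:
    """Combine the fixed pack with one scene-specific action."""
    return (
        f"Use {pack.hero_reference} as the reference. "
        f"{action} "
        f"{pack.camera}, {pack.lighting}, {pack.style}."
    )


pack = ReferencePack(
    hero_reference="product-front-reference.png",
    camera="Camera locked-off",
    lighting="soft morning light",
    style="clean product demo style",
)

# Only the action differs between scenes; everything else comes from the pack.
actions = [
    "The bottle rotates slowly once on the wooden desk.",
    "The cap lifts off and settles beside the bottle.",
]
prompts = [scene_prompt(pack, action) for action in actions]

for prompt in prompts:
    print(prompt)
```

Because the pack is frozen, the camera, lighting, and style phrases cannot drift by accident between prompts.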
Image-To-Video Benchmark Framework
A fair image-to-video benchmark uses the same reference image across every model or prompt variation. Keep the output duration, aspect ratio, and motion instruction constant. Score reference preservation separately from overall visual appeal.
For Excel and dashboard videos, add a data-integrity score: did the clip preserve the chart direction, headline number, and relative ranking? If the generated video misrepresents the data, it fails even if it looks polished.
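A data-integrity check can be written down as three comparisons: trend direction, headline number, and relative ranking. The sketch below assumes a reviewer reads the observed values off the generated clip by hand; nothing here inspects video automatically. All names and all numbers are made up for illustration.

```python
def trend_direction(values: list[float]) -> str:
    """Classify a series as 'up', 'down', or 'flat' from its first and last values."""
    if values[-1] > values[0]:
        return "up"
    if values[-1] < values[0]:
        return "down"
    return "flat"


def ranking(values_by_label: dict[str, float]) -> list[str]:
    """Return labels ordered from largest to smallest value."""
    return sorted(values_by_label, key=lambda label: values_by_label[label], reverse=True)


def data_integrity_ok(source: dict, observed: dict) -> bool:
    """Pass only if trend, headline number, and ranking all match the source."""
    return (
        trend_direction(source["series"]) == trend_direction(observed["series"])
        and source["headline"] == observed["headline"]
        and ranking(source["by_label"]) == ranking(observed["by_label"])
    )


# Illustrative values only: what the source chart shows ...
source = {
    "series": [10, 12, 15],
    "headline": "15",
    "by_label": {"North": 15, "South": 9, "West": 12},
}
# ... and what a reviewer read off the generated clip (South and West swapped).
observed = {
    "series": [10, 12, 15],
    "headline": "15",
    "by_label": {"North": 15, "South": 12, "West": 9},
}

print(data_integrity_ok(source, observed))
```

Here the trend and headline survive but the ranking does not, so the clip fails the check.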
| Metric | Question | Fail condition |
|---|---|---|
| Reference preservation | Does the product, character, or chart remain recognisable? | Core object changes shape or identity. |
| Motion logic | Does the motion fit the still image? | Impossible limbs, warped product, or drifting chart. |
| Scene consistency | Do lighting and environment stay coherent? | Background changes without instruction. |
| Data integrity | Are numbers and chart directions respected? | Video implies a different business insight. |
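The table can be turned into a small scorecard so that every variation is judged the same way. This is a minimal sketch under stated assumptions: the 0 to 5 scale, the rule that a score of 0 means the fail condition was hit, and the names `ClipScore`, `failed_metrics`, and `passes` are all illustrative choices rather than an established benchmark format.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ClipScore:
    """Reviewer scores for one clip, 0 (fail condition hit) to 5 (assumed scale)."""
    reference_preservation: int
    motion_logic: int
    scene_consistency: int
    data_integrity: Optional[int] = None  # None when the clip contains no data

    def failed_metrics(self) -> list[str]:
        """Names of metrics scored 0, meaning they hit their fail condition."""
        return [name for name, score in asdict(self).items() if score == 0]

    def passes(self) -> bool:
        """A clip passes only if no metric hit its fail condition."""
        return not self.failed_metrics()


# Made-up scores for two prompt variations using the same reference image.
scores = {
    "prompt-a": ClipScore(reference_preservation=4, motion_logic=3, scene_consistency=4),
    "prompt-b": ClipScore(reference_preservation=5, motion_logic=4, scene_consistency=4,
                          data_integrity=0),
}

passing = [name for name, score in scores.items() if score.passes()]
print(passing)
```

In this example the second variation scores higher on every visual metric but still fails, because any single fail condition rejects the clip.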
Step-by-step: turn one reference image into a clip
This is the safest beginner workflow for image-to-video because it uses one reference and one movement. Add more references only after this basic version works.
- Prepare the image. Crop the reference so the main subject is obvious. Remove clutter, tiny text, and background elements that the model might treat as important.
- Name what must be preserved. Write a short list: product shape, logo position, chart direction, character outfit, colour palette, or room layout.
- Choose one motion. Examples: slow rotation, camera push-in, pages turning, product lid opening, chart bars rising, or character walking forward.
- Choose the output format. Use landscape for explainers and vertical for Shorts. Mention the intended platform if framing matters.
- Write the prompt in a fixed order. State the reference role first, then cover preservation, motion, camera, style, and constraints.
- Generate one version. Do not change the reference or add a second asset until you know whether the first image is being preserved.
- Review against the reference. Put the reference beside the output and compare shape, colour, layout, and meaning.
- Make one correction. If the logo changes, fix only logo preservation. If the motion is too fast, fix only motion speed.
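The last step is easier to enforce if each attempt is stored as a set of named prompt fields, so you can confirm that only one field changed between attempts. The sketch below assumes you keep prompts as plain dictionaries; the field names and helper functions are hypothetical.

```python
def changed_fields(previous: dict[str, str], revised: dict[str, str]) -> list[str]:
    """Return the names of prompt fields whose text differs between two attempts."""
    keys = sorted(set(previous) | set(revised))
    return [key for key in keys if previous.get(key) != revised.get(key)]


def is_single_correction(previous: dict[str, str], revised: dict[str, str]) -> bool:
    """True when exactly one field changed, so the effect of the fix can be isolated."""
    return len(changed_fields(previous, revised)) == 1


attempt_1 = {
    "preserve": "Preserve the bottle shape, cap, label colours, and front-facing logo.",
    "animate": "The bottle rotates once on the desk.",
    "camera": "Camera locked-off.",
}
# The motion was too fast, so only the motion field is revised.
attempt_2 = dict(attempt_1, animate="The bottle rotates slowly once on the desk.")

print(changed_fields(attempt_1, attempt_2))
print(is_single_correction(attempt_1, attempt_2))
```

If more than one field shows up in the list, it becomes hard to tell which change caused the difference in the next output.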
Reference prompt worksheet
Fill this out before writing the final prompt. It prevents vague instructions such as "make this image move".
| Field | Example |
|---|---|
| Reference role | Use the uploaded product photo as the exact bottle reference. |
| Preserve | Bottle shape, green cap, white label, front logo position. |
| Animate | Rotate slowly once on the desk. |
| Camera | Locked-off camera, medium close-up. |
| Style | Clean product demo, soft morning light. |
| Constraints | No extra props, no generated text, no label changes. |
| Final frame | End with the bottle centred and still. |
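A filled-in worksheet can be assembled into the final prompt mechanically, which also catches blank fields. The sketch below is one possible way to do that: `FIELD_ORDER` and `build_prompt` are illustrative names, and the example values are the worksheet rows rephrased as full sentences.

```python
# Order in which worksheet fields are joined into the prompt.
FIELD_ORDER = [
    "reference_role",
    "preserve",
    "animate",
    "camera",
    "style",
    "constraints",
    "final_frame",
]


def build_prompt(worksheet: dict[str, str]) -> str:
    """Join worksheet fields in a fixed order, refusing to run if any field is blank."""
    missing = [field for field in FIELD_ORDER if not worksheet.get(field, "").strip()]
    if missing:
        raise ValueError("Worksheet fields missing: " + ", ".join(missing))
    return " ".join(worksheet[field].strip() for field in FIELD_ORDER)


worksheet = {
    "reference_role": "Use the uploaded product photo as the exact bottle reference.",
    "preserve": "Preserve the bottle shape, green cap, white label, and front logo position.",
    "animate": "The bottle rotates slowly once on the desk.",
    "camera": "Locked-off camera, medium close-up.",
    "style": "Clean product demo, soft morning light.",
    "constraints": "No extra props, no generated text, no label changes.",
    "final_frame": "End with the bottle centred and still.",
}

prompt = build_prompt(worksheet)
print(prompt)
```

Raising an error on a blank field is a deliberate choice here: a missing constraint is easier to fix before generation than after.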
After-generation review
- Take three frame grabs: first second, middle, final frame.
- Compare each frame to the original reference.
- Reject the clip if the core object changes shape or identity.
- Reject dashboard clips if the trend, ranking, or message changes.
- Keep the prompt and reference together so the result can be repeated.
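The three frame grabs can be planned from the clip duration. The sketch below only computes timestamps and builds command lists; it does not run anything. It assumes you have ffmpeg installed if you choose to run the commands, and it takes the final grab slightly before the end because seeking to the exact end of a file may not return a frame. The half-second first grab, the margin, and the output file names are arbitrary choices.

```python
def review_timestamps(duration_seconds: float, end_margin: float = 0.1) -> list[float]:
    """Timestamps for three grabs: inside the first second, the middle, just before the end."""
    if duration_seconds <= 1:
        raise ValueError("clip must be longer than one second")
    return [
        0.5,
        round(duration_seconds / 2, 2),
        round(duration_seconds - end_margin, 2),
    ]


def grab_commands(clip_path: str, duration_seconds: float) -> list[list[str]]:
    """Build one ffmpeg command per timestamp, each extracting a single frame."""
    commands = []
    for index, timestamp in enumerate(review_timestamps(duration_seconds), start=1):
        commands.append([
            "ffmpeg",
            "-ss", str(timestamp),   # seek to the timestamp
            "-i", clip_path,
            "-frames:v", "1",        # write exactly one video frame
            f"frame-{index}.png",
        ])
    return commands


for command in grab_commands("clip.mp4", 8):
    print(" ".join(command))
```

Each list can be passed to `subprocess.run` if you want to automate the grabs; printing them first makes it easy to check the timestamps by eye.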
Asset-prep checklist
Most bad image-to-video results start before the prompt. Prepare the reference so Gemini Omni does not have to guess what matters.
- Use one hero subject per image when possible.
- Crop out unrelated props, watermarks, and tiny UI details.
- Keep important labels readable, but do not depend on generated video to preserve small text.
- Export charts and dashboards at a size where the trend is visible without zooming.
- For products, include the cleanest angle first; use extra angles only when they clarify shape.
- For characters, keep outfit, face angle, and lighting consistent across references.
- Name assets by role, such as `product-front-reference.png` or `dashboard-q2-revenue.png`.
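If you prepare many assets, a small helper can keep role-based names consistent. This is a minimal sketch: `asset_name` is a hypothetical helper, and the lowercase hyphenated pattern is inferred from the example file names in the checklist.

```python
import re


def asset_name(subject: str, role: str, extension: str = "png") -> str:
    """Build a lowercase, hyphen-separated file name from a subject and a role."""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", f"{subject} {role}".lower()).strip("-")
    return f"{slug}.{extension}"


print(asset_name("Product front", "reference"))
print(asset_name("Dashboard Q2", "revenue"))
```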
Common mistakes
- Uploading a busy reference and expecting the model to know which object matters.
- Changing the reference image between scenes that need consistency.
- Using tiny chart text as the only source of a data story.
- Asking for too many storyboard beats in a short clip.
- Ignoring rights and likeness questions around reference assets.
Related tutorials
These tutorials connect Gemini Omni image-to-video work with prompt structure, editing, Excel visuals, and comparable Seedance workflows.
- Best Gemini Omni Prompts for AI Video: Camera, Motion, Style, and References
- Gemini Omni Tutorial: How to Create Your First AI Video Step by Step
- Gemini Omni Video Editing: Multi-Turn Edits, Camera Changes, and Style Transfers
- Gemini Omni for Excel: Turn Charts, Dashboards, and Spreadsheet Insights into AI Videos
- How to Use Gemini Omni for YouTube Shorts: 10-Second Clips, Remixing, and Safe Publishing
- How to Use Seedance 2.0 for Image to Video Prompts
- How to Make Consistent Characters in Seedance 2.0
- How to Make Professional Charts in Excel (Step-by-Step Guide)
Sources
These official references are useful if you need the product or framework documentation alongside this guide.
- Google DeepMind: Gemini Omni prompt guide
- Google DeepMind: Gemini Omni model overview
- Google Flow Help: models and supported features