How to Create Lip-Sync and Talking-Head Videos in Seedance 2.0 (2026)

Coding Liquids blog cover featuring Sagnik Bhattacharya for the Seedance 2.0 lip-sync and talking-head guide.

Talking-head and lip-sync video is the hardest thing to do well in any text-to-video model, and Seedance 2.0 is no exception. Most beginner tutorials overpromise — "just type 'character speaking' and Seedance will do the rest" — and then quietly ship clips where the mouth flaps randomly and the face melts halfway through. This guide is the honest version. It tells you what Seedance can do for faces and talking heads, what it cannot, and how to build a two-tool workflow that produces usable results today.

By the end you will know when to use Seedance alone, when to pair it with a dedicated lip-sync tool, and when to reach for a different solution entirely.

I teach Flutter and Excel with AI — explore my courses if you want structured learning.

Quick answer

Seedance 2.0 can generate believable talking-head motion — subtle head movement, blinks, generic mouth activity — but it cannot tightly sync mouth shapes to specific words in an audio track. The workflow that actually ships is: generate a silent Seedance clip of a character talking, then overlay tight lip sync with a dedicated lip-sync tool that takes your audio and retargets the mouth shapes. Seedance is the base video; the specialist tool handles the mouth.

This guide is for you if:

  • You want a character to deliver a line to camera without paying an actor.
  • You have tried "just type it in Seedance" and the mouth flaps randomly.
  • You need to make a short talking-head explainer and are picking a stack.

What Seedance 2.0 can and cannot do for faces

Let me be blunt about the limitations up front, because most of the frustration around Seedance talking heads comes from expecting the wrong thing.

  • Can do: generic talking mouth motion, natural head movement, eye blinks, soft expression changes, subtle emotional reads.
  • Can do with care: a consistent character identity across multiple clips, if you use the same reference image and prompt.
  • Cannot do: tight word-level lip sync to a specific audio track. The model does not know what the audio says.
  • Cannot do reliably: photoreal human faces with no warping across a full 10 seconds — expect some identity drift near the end.

Once you accept that Seedance produces believable talking motion rather than synced speech, the workflow becomes obvious: use Seedance to generate the base video of the character in motion, and layer the mouth sync separately.

Pick the right starting image

Character work lives or dies on the reference image. Seedance will do a competent job animating a clean portrait and a terrible job animating a complicated one. What makes a good reference image for talking-head work:

  • Medium close-up framing — shoulders-up to chest-up. Too tight and the model has nothing to animate around the face; too wide and the face is too small to animate well.
  • Face clearly visible and front-facing — slight angle is fine, extreme profile is not.
  • Soft even lighting — hard side light creates shadows that shift weirdly when the model animates.
  • Mouth closed or slightly parted — neutral mouth starting state animates more naturally than a huge grin.
  • Simple background — a solid colour, soft bokeh, or a minimal environment. Busy backgrounds distract the model from the face.

The reference images for characters guide goes deeper on picking portraits that animate well. Read it alongside this post if you are making multiple talking-head clips.

Prompt structure for talking-head clips

The prompt for a talking head looks different from a typical Seedance prompt. You are not asking for a dramatic camera move — you are asking for subtle, natural human motion that sells "this person is speaking".

A working pattern:

  1. Subject description — who the character is, clearly.
  2. Framing cue — "medium close-up, chest-up, eye contact with camera".
  3. Motion cue — "subject speaking naturally to camera, subtle head movement, natural blinks, mouth moving as if talking".
  4. Lighting cue — "soft even lighting, gentle rim light".
  5. Mood cue — "calm, professional, warm" — this drives micro-expression.

Notice what is not in that prompt: specific words, dramatic camera moves, high-energy verbs. Those all fight against face stability. For more on the prompting principles behind this, see better Seedance prompts.
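The five-part pattern above can be sketched as a small helper that assembles the prompt from labelled parts. This is a minimal sketch: the function name and example values are mine for illustration, not part of any Seedance API.

```python
# Assemble a talking-head prompt from the five labelled parts above.
# All example values are illustrative, not an official Seedance schema.

def build_talking_head_prompt(subject, framing, motion, lighting, mood):
    """Join the five prompt parts into one comma-separated prompt string."""
    return ", ".join([subject, framing, motion, lighting, mood])

prompt = build_talking_head_prompt(
    subject="a woman in her 30s with short dark hair, wearing a grey blazer",
    framing="medium close-up, chest-up, eye contact with camera",
    motion=("subject speaking naturally to camera, subtle head movement, "
            "natural blinks, mouth moving as if talking"),
    lighting="soft even lighting, gentle rim light",
    mood="calm, professional, warm",
)
print(prompt)
```

Keeping the parts as separate named arguments makes it easy to hold the subject block fixed across clips while varying only the mood, which matters later for character consistency.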

Settings for talking heads

| Setting | Value | Why |
| --- | --- | --- |
| Mode | Image-to-video | Text-to-video identity is fragile for faces |
| Reference image | Clean portrait, medium close-up | Gives the model a stable face to preserve |
| Duration | 5 seconds | Face stability drops after ~6 seconds |
| Resolution | 1080p for finals | Face detail matters at playback size |
| Aspect ratio | 9:16 or 1:1 | Vertical or square keeps the face large on a phone |
| Motion intensity | 30–45 | Low enough to preserve identity |

Do not push motion intensity above 50 for talking heads. Every step up increases the chance of the face drifting into someone who looks slightly different by the end of the clip. The motion intensity guide explains the tradeoff in more depth.
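The guardrails in the table can be encoded as a quick pre-generation sanity check. A sketch only: the setting names here are descriptive labels I chose, not real Seedance parameters.

```python
# Guardrails from the settings table above, expressed as a checker.
# Setting names are descriptive labels, not real Seedance API fields.

def check_talking_head_settings(settings):
    """Return a list of warnings for settings that fight face stability."""
    warnings = []
    if settings.get("mode") != "image-to-video":
        warnings.append("use image-to-video; text-to-video identity is fragile")
    if settings.get("duration_s", 0) > 6:
        warnings.append("face stability drops after ~6 seconds; use 5s clips")
    if settings.get("motion_intensity", 0) > 45:
        warnings.append("cap motion intensity at 45 to limit identity drift")
    return warnings

print(check_talking_head_settings(
    {"mode": "image-to-video", "duration_s": 10, "motion_intensity": 60}
))
```

Running this over a planned batch of clips before generating catches the two mistakes that waste the most credits: long durations and high motion intensity.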


The two-tool workflow for real lip sync

Here is the workflow that actually produces usable talking-head content in 2026.

  1. Write your script. Keep each line short — roughly 5 to 10 seconds of speech per line; a line longer than one clip gets covered by stitching multiple 5-second clips.
  2. Record or generate the voiceover. Use a voice actor, your own voice, or a text-to-speech tool.
  3. Generate a silent Seedance clip of the character "speaking to camera" at the same length as the audio line.
  4. Run the silent clip through a dedicated lip-sync tool. The tool takes your Seedance video and your audio file, and retargets the mouth shapes to match the words. This is the step that produces real sync.
  5. Import the synced clip into your editor alongside any other footage, background music, and captions.
  6. Repeat per line and stitch into the final video.

Skipping step 4 — the dedicated lip-sync tool — is the single biggest reason Seedance-only talking-head attempts look amateur. The mouth needs a specialist; Seedance provides the base identity and natural motion around it.
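One practical check before step 3: the silent clip must cover the full length of its audio line. For WAV voiceover files, the duration can be read with Python's standard-library `wave` module — a sketch assuming WAV audio; other formats need a tool like ffprobe.

```python
import math
import wave

def wav_duration_seconds(path):
    """Duration of a WAV file: frame count divided by sample rate."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / w.getframerate()

def clips_needed(audio_seconds, clip_seconds=5.0):
    """How many 5-second Seedance clips it takes to cover one audio line."""
    return math.ceil(audio_seconds / clip_seconds)

# e.g. an 8.2-second line needs two 5-second clips stitched together
print(clips_needed(8.2))  # -> 2
```

Checking this up front avoids the frustrating failure mode where the lip-sync tool runs out of video before the audio ends.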

Keeping the same character across multiple clips

If you are making a multi-clip explainer with the same character, consistency is the hard part. Seedance does not natively "remember" a character across sessions — you have to enforce identity yourself.

  • Use the exact same reference image for every clip in the sequence. Do not swap it for a different angle halfway through.
  • Keep the subject description identical in every prompt — copy-paste the character block, do not paraphrase.
  • Fix the seed if the platform exposes it. Same seed plus same reference plus same prompt gives the most consistent results.
  • Keep framing consistent — always medium close-up, always the same rough angle.
  • Grade clips in your editor to match colour and exposure if they drift slightly.

For deeper technique, read consistent characters in Seedance — it is the companion to this post for multi-clip character work.
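The copy-paste discipline above can be enforced in code by freezing the identity-bearing part of the prompt once and varying only the mood cue. A sketch: the character block, field layout, and seed handling are illustrative, not a real Seedance API.

```python
# Freeze the identity-bearing parts of the prompt once, then reuse
# them verbatim for every clip; only the mood cue varies.
# The block text and the seed are illustrative, not a Seedance API.

CHARACTER_BLOCK = (
    "a woman in her 30s with short dark hair, wearing a grey blazer, "
    "medium close-up, chest-up, eye contact with camera, "
    "subject speaking naturally to camera, subtle head movement, "
    "natural blinks, soft even lighting"
)
SEED = 123456  # fix the seed if the platform exposes it

def clip_prompt(mood):
    """Same character block every time; only the mood cue changes."""
    return f"{CHARACTER_BLOCK}, {mood}"

prompts = [clip_prompt(m) for m in
           ("calm and warm", "slightly excited", "thoughtful and serious")]

# Every prompt starts with the identical character block
assert all(p.startswith(CHARACTER_BLOCK) for p in prompts)
```

Generating the prompts from one constant removes the temptation to paraphrase the character description between clips, which is where identity drift usually creeps in.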

Common talking-head mistakes

| Mistake | Symptom | Fix |
| --- | --- | --- |
| Expecting perfect lip sync from Seedance alone | Mouth flaps randomly, feels uncanny | Add a dedicated lip-sync tool in post |
| High motion intensity on faces | Identity drifts mid-clip | Cap at 45 for talking heads |
| Busy reference portrait | Model loses the face to the background | Use a clean, simple portrait |
| 10-second talking-head clips | Face morphs near the end | Use 5-second clips and stitch |
| Different reference per clip | Character looks inconsistent | One reference image, reuse it |
| Writing actual dialogue into the prompt | Prompt ignores the words | Describe motion, not dialogue content |

Worked example: a 30-second explainer with one character

Here is the full pipeline for a 30-second explainer featuring a single AI character delivering three lines to camera.

  1. Script three lines, each about 10 seconds of speech. Record voiceover (or generate with a TTS tool) as three separate audio files.
  2. Pick one clean portrait that fits the character. Medium close-up, soft lighting, simple background, neutral mouth.
  3. Generate the Seedance footage for each line as two 5-second clips stitched back to back — a single 10-second clip risks the face morphing near the end. Use the same portrait and the same prompt for every clip, varying only the mood cue ("calm and warm", "slightly excited", "thoughtful and serious"). Motion intensity 40.
  4. Run each line's stitched footage through a dedicated lip-sync tool with its matching audio file. You now have three video files with synced mouths.
  5. Open a video editor (CapCut, Premiere, Resolve). Drop the three synced clips on the timeline with brief crossfades.
  6. Layer background music at low volume, add captions for accessibility, grade all three clips to match colour.
  7. Export at 1080p vertical for social or 1080p horizontal for YouTube. Done.

Total time: 1–2 hours for someone who has done this pipeline before. The tricky parts are not the Seedance generations — they are the script, the audio, and the consistency across clips.

When Seedance is not the right tool for talking heads

Be honest about this. If your project needs a single character delivering 60+ seconds of dialogue with perfect sync and photoreal fidelity, Seedance is not the fastest path. A dedicated AI avatar tool will produce tighter sync with less work. Seedance wins when you want a unique stylised character, a non-photoreal look, or a specific cinematic framing that avatar tools cannot produce — and you are willing to add the lip-sync tool in post to make it work.

For audio pairing generally (not just dialogue), Seedance audio prompts covers the broader topic of making clips that work well with a soundtrack.

Related guides on this site

Talking-head work pulls in several other Seedance topics. These are the natural companions.

Want to use AI tools more effectively?

My courses cover practical AI workflows, from spreadsheet automation to app development, with real projects and honest tool comparisons.

Browse AI courses