Capturing Your Talking Footage

A recording guide for performers.

What this is for

This footage teaches a model how you look and move when you talk — your mouth and jaw, your expressions, the way your head shifts, the gestures that come with speech. It learns what the camera sees, so what matters is visible, natural motion, not what you're actually saying. You're not performing a script; you're giving the model the full range of how you talk.

One thing to set expectations: this teaches a natural talking presence — believable mouth and face movement — not exact word-for-word lip-sync. So you don't need to hit specific lines. You need to talk naturally, across a real range.

Talk at different energy levels

This is the biggest visual difference, so cover it well. Record yourself talking at each level:

  • Calm and low-key — small mouth movement, little gesture, relaxed
  • Conversational — your normal speaking manner
  • Animated and emphatic — bigger expressions, gestures, more head movement

Add a little whispering (minimal mouth movement, intimate, lowered brow) and, sparingly, some louder or raised speech if you want the character to reach that intensity. You don't need much of the extremes; a few good clips of each will do.

Talk with different expressions

Speak while carrying different feelings, so the model can talk and emote at once:

  • Neutral
  • Happy — smiling or laughing while talking
  • Serious or concerned
  • Surprised

Add sad or angry only if you need them — and when you do, make them clearly visible rather than subtle, since faint expressions don't read on camera.

Vary gaze, angle, and framing

  • Eyes to camera — the direct, monologue feel
  • Looking off-camera — as if talking to someone beside you
  • A few head angles — straight on, and a three-quarter turn to each side. Avoid extreme side profiles.
  • Shot sizes — both close-up and medium, but always keep your face a meaningful part of the frame.

Cover the mouth states

  • Active speech — actually talking, so the mouth moves through its natural range
  • Pauses and listening — mouth closed, reacting and present, as if hearing someone
  • Laughing — a real laugh mid-conversation

Keep your face clear

Your face needs to be clearly visible and well-lit in every clip. Avoid anything that blocks it — hands over your mouth, hair across your face, a mic in front of your lips. Same person, same face, clearly readable throughout.

How to record each clip

  • One continuous shot per clip. Don't cut or splice within a clip — record it in a single take. Aim for around 5 seconds.
  • Keep it sharp. Drop anything blurry or out of focus around the face; soft footage produces a smeared, mushy mouth.
  • Move naturally, but don't shake. Natural head movement and gesture are exactly what you want. A near-frozen clip teaches a stiff result; a violently shaky one turns to mush. Aim for relaxed, intentional motion.
  • Stay consistent. Same recording setup and frame rate across your whole set.
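
If you or someone helping you is comfortable with a little scripting, the length and frame-rate points above can be checked automatically. The following is a minimal sketch, not part of any official tooling: the `Clip` record, the `check_clips` helper, the file names, and the 2-second tolerance around the 5-second target are all assumptions made for illustration. It also assumes you have already read each clip's duration and frame rate from your camera or editing software.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Clip:
    name: str
    duration_s: float  # clip length in seconds
    fps: float         # recorded frame rate


def check_clips(clips, target_s=5.0, tolerance_s=2.0):
    """Return a list of human-readable warnings for a set of clips.

    target_s and tolerance_s are illustrative defaults, not fixed rules.
    """
    warnings = []
    if not clips:
        return warnings

    # Treat the most common frame rate as the intended one for the set.
    expected_fps = Counter(c.fps for c in clips).most_common(1)[0][0]

    for c in clips:
        if abs(c.duration_s - target_s) > tolerance_s:
            warnings.append(
                f"{c.name}: {c.duration_s:.1f}s is far from the ~{target_s:.0f}s target"
            )
        if c.fps != expected_fps:
            warnings.append(
                f"{c.name}: {c.fps} fps differs from the set's {expected_fps} fps"
            )
    return warnings


# Hypothetical clip metadata, typed in by hand for the example.
clips = [
    Clip("calm_01.mp4", 5.2, 30.0),
    Clip("animated_03.mp4", 4.8, 30.0),
    Clip("whisper_02.mp4", 11.0, 30.0),
    Clip("laugh_01.mp4", 5.0, 24.0),
]
for warning in check_clips(clips):
    print(warning)
```

A check like this only catches length and frame-rate drift. Sharpness and shake still need a person to watch the footage.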

Vary your setting — within reason

Record across a few different backgrounds and lighting setups rather than the same room every time, so the model doesn't fuse the scenery onto you. But legibility always wins: never let "variety" leave your face dim or hidden. A handful of settings is plenty — the point is to break the sameness, not to chase variety for its own sake.

Quality over quantity

A smaller, clean, well-varied set beats a huge messy one. Record more than you think you need so you can keep the best, but a few dozen strong, varied clips are worth more than hundreds of similar ones.


What to record — talking shot list

Work through these so each one is represented several times. You don't need every combination — you need each item covered.

Energy levels

  • Calm, low-key talking
  • Normal conversational talking
  • Animated, emphatic talking
  • Whispering
  • Louder / raised speech (a few clips)

While talking, in each mood

  • Neutral
  • Happy — smiling, laughing mid-speech
  • Serious / concerned
  • Surprised
  • (Sad or angry, only if needed — make them clearly visible)

Gaze and framing

  • Eyes to camera (monologue)
  • Looking off-camera (talking to someone)
  • Straight-on, three-quarter left, three-quarter right
  • Close-up and medium shots

Mouth states

  • Active natural speech
  • Pausing and listening (closed mouth, reacting)
  • Laughing while talking
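
If you label each clip as you go, a short script can show which shot-list items are still thin. This is a rough sketch under stated assumptions: the tag names, the `coverage_gaps` helper, and the threshold of three clips per item (one reading of "several times") are hypothetical choices, not requirements from this guide. Sad and angry are left out because they are optional.

```python
from collections import Counter

# Shot-list items grouped by category; tag names are illustrative.
SHOT_LIST = {
    "energy": ["calm", "conversational", "animated", "whisper", "raised"],
    "mood": ["neutral", "happy", "serious", "surprised"],
    "gaze": ["to_camera", "off_camera"],
    "angle": ["straight", "three_quarter_left", "three_quarter_right"],
    "shot": ["close_up", "medium"],
    "mouth": ["speech", "listening", "laughing"],
}


def coverage_gaps(clip_tags, min_count=3):
    """Return {category: {item: clips_still_needed}} for under-covered items."""
    gaps = {}
    for category, items in SHOT_LIST.items():
        # Count how many clips carry each tag in this category.
        counts = Counter(tags.get(category) for tags in clip_tags)
        missing = {
            item: min_count - counts[item]
            for item in items
            if counts[item] < min_count
        }
        if missing:
            gaps[category] = missing
    return gaps


# Hypothetical labels: three clips that all cover the same combination.
example = {
    "energy": "calm",
    "mood": "neutral",
    "gaze": "to_camera",
    "angle": "straight",
    "shot": "close_up",
    "mouth": "speech",
}
clip_tags = [dict(example) for _ in range(3)]

for category, missing in coverage_gaps(clip_tags).items():
    print(category, missing)
```

Each category is counted separately, which matches the advice above: every item needs to be covered, but not every combination.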