Overview

Create professional-quality talking videos by animating a still photo of a person with realistic lip-sync and natural expressions. Choose Realistic mode for high-fidelity results with accurate lip-sync, or Expressive mode for a more animated, expression-guided style. Typical uses include marketing videos, presentations, social media, and personalized messages.

Realistic is the default and recommended option for best results. It provides sharper details, more realistic skin textures, and superior lip-sync accuracy compared to Expressive mode.

Prerequisites

A Magic Hour account (free or paid)
A clear, front-facing photo of a person (JPG, PNG, or similar image format)
Audio source: text to convert to speech, a cloned voice, or an audio file (MP3, WAV, etc.)
Available credits (free users can claim 100 credits daily, up to seven times; generation cost varies by duration, motion smoothness, and mode)
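
The file-format prerequisites above can be checked before you open the tool. The sketch below is only an illustration: the function name `check_inputs` and the extension sets are assumptions, and because this guide says "JPG, PNG, or similar" and "MP3, WAV, etc.", other formats may also be accepted by Magic Hour.

```python
from pathlib import Path

# Assumed extension sets. The guide lists JPG/PNG and MP3/WAV as examples,
# so these are illustrative rather than an official, exhaustive list.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
AUDIO_EXTENSIONS = {".mp3", ".wav"}


def check_inputs(photo_path: str, audio_path: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if Path(photo_path).suffix.lower() not in IMAGE_EXTENSIONS:
        problems.append(f"Photo '{photo_path}' is not a JPG or PNG file.")
    if Path(audio_path).suffix.lower() not in AUDIO_EXTENSIONS:
        problems.append(f"Audio '{audio_path}' is not an MP3 or WAV file.")
    return problems


if __name__ == "__main__":
    # Hypothetical file names, used only to show the output format.
    for problem in check_inputs("portrait.PNG", "voiceover.ogg"):
        print(problem)
```

The check looks only at file extensions. It does not confirm that the face is front-facing or that the audio is clear, which still need a visual and listening check.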

Before You Begin

Plan your photo: For best results, use a high-quality, well-lit image where the person's face is clearly visible and facing forward. Blurry, side-angled, or heavily obscured faces will produce lower-quality results.

Credit usage: Credit cost is based on video duration, motion smoothness, and the generation mode selected. Realistic mode costs more per second than Expressive mode — check the credit estimate in the UI before generating. Free-plan users generate watermarked outputs at 576px resolution; upgrading unlocks higher resolutions and removes watermarks.
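
As a rough planning aid, the relationship between credit balance and clip length can be sketched as below. This is a simplified model under stated assumptions: it uses a flat per-second rate that you supply, it ignores motion smoothness, and the rate in the example is a placeholder rather than a Magic Hour price. The credit estimate shown in the UI remains the authoritative figure.

```python
import math

MAX_DURATION_SECONDS = 60  # maximum supported duration stated in this guide


def max_affordable_seconds(balance: int, credits_per_second: float) -> int:
    """Longest whole-second clip a balance can cover at a flat per-second rate."""
    if credits_per_second <= 0:
        raise ValueError("credits_per_second must be positive")
    affordable = math.floor(balance / credits_per_second)
    # Never report more than the tool's maximum clip length.
    return max(0, min(affordable, MAX_DURATION_SECONDS))


# Placeholder rate for illustration only; NOT a real Magic Hour price.
example_rate = 10.0
print(max_affordable_seconds(balance=100, credits_per_second=example_rate))  # 10
```

If the result is shorter than your audio, trim the audio to that length or add credits before generating.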

Step-by-Step Guide

Step 1: Access the Talking Photo Tool

Sign in to Magic Hour at magichour.ai.
Click Create in the top navigation.
Select Talking Photo from the dropdown menu. You'll be taken to the creation page.

Step 2: Select Your Generation Mode

On the creation page, look for the Generation Mode selector.
Choose your mode:
- Realistic (default) — Realistic detail, accurate lip sync, and faster generation
- Expressive — Animated style with prompt-guided expression and movement
Optional: Click the info icon next to "Generation Mode" to see a side-by-side comparison video.

Realistic mode is pre-selected because it offers the best output quality, faster generation, and the most accurate lip-sync.

Step 3: Upload Your Photo

In the center of the screen, click the upload area or drag and drop your photo file.
Select a clear, front-facing image of the person you want to animate.
The photo will preview immediately below the upload area.

Alternative: If you don't have a photo ready, scroll down to the Preset Faces Carousel and select one of the sample faces to test the feature.

Step 4: Review Advanced Settings (Optional)

Below your photo preview, expand the Advanced Settings accordion.
In Expressive mode, you can enter a custom prompt to guide facial expression and movement.
In Realistic mode, expression is handled automatically — no additional adjustments are needed.

Step 5: Select Your Audio Source

Choose one of three audio options:

Option A: Upload an Audio File

Click Upload from device and select an MP3, WAV, or similar audio file.
The file uploads and displays in the audio player.

Option B: Use a Preset Audio

Browse and select from the available preset audio clips.

Option C: Use a Cloned Voice / Text to Speech

Select a voice and enter your script to generate an AI voiceover.

Step 6: Trim Audio Duration (Optional)

Below the audio player, adjust the Start Seconds and End Seconds sliders to select the portion of audio you want to use.
The maximum supported duration is 60 seconds.
The UI displays the trimmed duration and estimated credit cost before you generate.

Trim error: If you set End Seconds beyond the audio duration, you'll see the error "The selected end seconds is past the end of the audio". Drag the End Seconds slider back to within the audio length to clear it.
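
The trimming rules in this step can be expressed as a small check. The sketch below is an illustration, not Magic Hour's implementation: the function name `validate_trim` and the wording of most messages are assumptions. Only the 60-second limit and the "past the end of the audio" message come from this guide.

```python
MAX_DURATION_SECONDS = 60.0  # maximum supported duration stated in this guide


def validate_trim(start_seconds: float, end_seconds: float, audio_duration: float) -> float:
    """Return the trimmed duration, or raise ValueError if the selection is invalid."""
    if start_seconds < 0:
        raise ValueError("Start Seconds cannot be negative.")
    if end_seconds > audio_duration:
        # Same wording as the error message quoted in the Troubleshooting table.
        raise ValueError("The selected end seconds is past the end of the audio")
    if end_seconds <= start_seconds:
        raise ValueError("End Seconds must be greater than Start Seconds.")
    trimmed = end_seconds - start_seconds
    if trimmed > MAX_DURATION_SECONDS:
        raise ValueError("Trimmed audio is longer than 60 seconds.")
    return trimmed


# A 42-second clip trimmed to the range 5 s to 25 s.
print(validate_trim(5.0, 25.0, audio_duration=42.0))  # 20.0
```

Running the same check on your own numbers before generating shows whether a selection fits inside both the audio file and the 60-second limit.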

Step 7: Review and Generate

Review your selections:
- Generation mode is set (Realistic or Expressive)
- Photo is uploaded and visible
- Audio is selected and trimmed
Check the estimated credit cost displayed on the Generate button.
Click Generate Talking Photo.

You'll see loading status messages as your video is processed.

Step 8: Download Your Video

Once generation completes, you'll be redirected to your project page with a success notification.
The video plays automatically.
Click Download Video to save the MP4 file to your device.

Verify the Result

To confirm your talking photo was created successfully:

Video plays: The generated video shows your photo animating with clear lip movements.
Lip-sync accuracy: Mouth movements sync to the audio — words and syllables align visually.
Natural expressions: The face shows natural, realistic movements (Realistic mode) or prompt-guided expressions (Expressive mode).
Audio clarity: The audio plays clearly with no gaps or distortion.
Download file: The MP4 saves to your device and plays in any standard video player.
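
If you want a quick automated check that the downloaded file is an MP4 container, the sketch below looks for the `ftyp` box that MP4 files normally carry at the start of the file. The function name `looks_like_mp4` is an assumption, and the check confirms only the container signature. It does not verify playback, lip-sync, or audio quality.

```python
import sys


def looks_like_mp4(path: str) -> bool:
    """Return True if the file starts with an ISO base media 'ftyp' box."""
    with open(path, "rb") as handle:
        header = handle.read(12)
    # Bytes 0-3 hold the box size and bytes 4-7 hold the box type.
    return len(header) >= 8 and header[4:8] == b"ftyp"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path of the downloaded video as the first argument.
    print(looks_like_mp4(sys.argv[1]))
```

A result of False usually means the download was interrupted or the file was saved under the wrong extension, so downloading again is a reasonable first step.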

Troubleshooting

Issue	Likely Cause	Solution
Lip movements look unnatural or don't match audio	Photo is side-angled, blurry, or face is obscured	Upload a clear, front-facing photo with good lighting. Regenerate.
"Insufficient credits" error during generation	Not enough credits for the video length	Purchase a credit pack, or trim the audio to a shorter duration.
"The selected end seconds is past the end of the audio" error	End Seconds slider exceeds audio length	Drag the End Seconds slider back to within the audio duration.
Upload fails for image or audio file	File is too large or unsupported format	Ensure image is JPG/PNG and audio is MP3/WAV. Try uploading again.
Video quality is low or blurry	Free plan limits resolution to 576px; watermark present	Upgrade to a paid plan (Creator, Pro, or Business) to unlock higher resolution and remove watermarks.
Generation is taking a long time or stuck	Video is queued during high platform load	Wait a few minutes. Try a shorter clip. If stuck after 10 minutes, contact support.
Audio is distorted or cuts off	Audio file is corrupted or trimmed incorrectly	Verify the audio plays correctly on your device. Re-upload and check Start/End Seconds sliders.

Realistic vs. Expressive Mode

Feature	Realistic	Expressive
Lip-sync accuracy	Highly accurate	Good (less precise)
Visual fidelity	Sharp details, realistic skin texture	Animated, stylized
Facial expressions	Natural, realistic movements	Prompt-guided expression and movement
Generation speed	Faster	Standard
Custom prompt	N/A (auto-optimized)	Available
Best for	Professional videos, marketing, presentations	Creative experimentation, stylized effects

Choose Realistic for the most lifelike, polished results. Switch to Expressive if you want to guide expressions with a text prompt or prefer a more stylized, animated look.
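
The choice between the two modes can be summarized as a simple decision rule. The sketch below is illustrative only: `GenerationMode` and `choose_mode` are hypothetical names that mirror the UI labels, not part of any Magic Hour API.

```python
from enum import Enum


class GenerationMode(Enum):
    REALISTIC = "Realistic"
    EXPRESSIVE = "Expressive"


def choose_mode(wants_custom_prompt: bool, wants_stylized_look: bool) -> GenerationMode:
    """Pick a mode following the comparison in this section."""
    # Only Expressive accepts a custom prompt or produces a stylized look.
    if wants_custom_prompt or wants_stylized_look:
        return GenerationMode.EXPRESSIVE
    return GenerationMode.REALISTIC  # the default mode


print(choose_mode(wants_custom_prompt=False, wants_stylized_look=False).value)  # Realistic
```

The rule defaults to Realistic and moves to Expressive only when one of its two distinguishing features is needed, which matches the guidance above.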

Limitations

Video duration: Maximum supported duration is 60 seconds.
Face visibility: Realistic mode works best with clear, front-facing faces.
Resolution limits: Free-plan videos generate at 576px; paid plans unlock 1024px and higher.
Watermarks: Free-plan outputs include a watermark. Upgrade to remove watermarks.
No batch processing: Generate one video at a time.
No audio-only edits: You cannot change the audio after generation. To use different audio, regenerate the video with the new audio file.

Best Practices

Use clear photos: High-quality, well-lit images with the face centered produce the best results.
Test audio quality: Ensure your audio is clear, without background noise, before generating.
Keep videos short: 10–30 seconds is optimal. Longer videos consume more credits.
Avoid extreme angles: Straight-on or slightly angled photos work best.
Use natural audio: Clear speech with normal pacing syncs better than mumbled or very fast audio.
Regenerate if needed: If the first result doesn't meet your needs, regenerate with updated inputs (costs additional credits).

What's Next

Combine Talking Photo with other tools: generate a talking photo, then apply Lip Sync or Face Swap for advanced effects.
Experiment with different audio sources: voice clone, text-to-speech, or uploaded audio.

Getting Help

If you encounter issues or have questions:

Email support: [email protected]
Community: Join the Magic Hour Discord to share feedback and ask questions.

When contacting support, include:

The photo and audio files you used (or descriptions of them)
Steps you took when the issue occurred
Any error messages you received
Your account email address

How to Create a Talking Photo