
Your First Project


This guide walks you through creating your first dubbed video. It assumes UltiVoice is installed and activated — if not, start with Installation.

The whole process takes about 5 minutes of hands-on time (plus processing time depending on video length and your hardware).

Before you start, have the following ready:

  • A video file in MP4, MKV, MOV, or AVI format, up to ~2 hours long. For your first test, use a short clip (1–3 minutes) to keep processing time low.
  • The source language of the video (the language being spoken).
  • Your target language (the language you want to dub into).
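
If you want to check a file against these constraints before importing it, the logic can be sketched in a few lines of Python. This is only an illustration, assuming you already know the clip's duration in seconds; `check_source` and the constant names are hypothetical and are not part of UltiVoice.

```python
from pathlib import Path

# Limits copied from this guide. The names are illustrative, not UltiVoice settings.
SUPPORTED_EXTENSIONS = {".mp4", ".mkv", ".mov", ".avi"}
MAX_DURATION_SECONDS = 2 * 60 * 60   # "up to ~2 hours" (the guide gives an approximate limit)
FIRST_TEST_MAX_SECONDS = 3 * 60      # upper end of the 1-3 minute clip suggested for a first test


def check_source(path: str, duration_seconds: float) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for a candidate source video.

    Errors mean the file falls outside the limits listed above.
    Warnings mean the file is acceptable but not ideal for a first run.
    """
    errors: list[str] = []
    warnings: list[str] = []

    # Compare the extension case-insensitively, so "clip.MP4" is accepted.
    extension = Path(path).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        errors.append(f"unsupported file type: {extension or '(no extension)'}")

    if duration_seconds > MAX_DURATION_SECONDS:
        errors.append("video is longer than about 2 hours")
    elif duration_seconds > FIRST_TEST_MAX_SECONDS:
        warnings.append("longer than 3 minutes: expect a slow first run")

    return errors, warnings
```

For example, `check_source("demo.mov", 600)` returns no errors and one warning, because a 10-minute clip is supported but longer than the suggested first-test length.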
  1. Create a new project

    Click New Project on the home screen, or choose File → New Project. Give the project a name and choose a folder where output files will be saved.

  2. Import your source video

    Click Add Source Video and select your file. UltiVoice reads the video metadata and displays duration, resolution, and detected audio tracks.

  3. Set the source language

    In the Source dropdown, select the language spoken in the video. If you’re unsure, leave it on Auto-detect — Whisper will identify the language during transcription.

  4. Set the target language

    In the Target dropdown, pick the language you want to dub into. You can add multiple target languages and process them in one run.

  5. Choose a voice

    Under Voice, select a preset voice for the target language, or click Clone voice to use a reference audio clip from the original speaker. For your first run, a preset voice is recommended.

    See Voice Selection & Cloning for full detail.

  6. Run the pipeline

    Click Start Dubbing. UltiVoice runs four stages in order:

    • Transcribe — Whisper reads the audio and produces a timed transcript.
    • Translate — Each segment is translated into the target language.
    • Synthesise — TTS generates dubbed audio for each segment.
    • Mix & render — FFmpeg assembles the final video.

    Progress is shown per stage. On a mid-range machine with a GPU, a 3-minute video typically completes in 5–10 minutes.

  7. Review and export

    When the pipeline finishes, the output video is previewed in the player. If any segments sound off, you can edit the translated text and re-synthesise individual segments without rerunning the full pipeline.

    Click Export to save the final video. See Export & Download for output format options.
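
To make the stage order in step 6 and the per-segment fix in step 7 concrete, here is a minimal conceptual sketch in Python. It is not UltiVoice's code: every class and function name is hypothetical, and each stage is a placeholder standing in for the real engine (Whisper, a translator, TTS, FFmpeg). It also assumes that fixing a segment repeats the final mix, which this guide does not state.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    start: float                 # seconds from the start of the video
    end: float
    source_text: str
    translated_text: str = ""
    audio: bytes = b""           # placeholder for synthesised speech


@dataclass
class Project:
    target_language: str
    segments: list[Segment] = field(default_factory=list)
    rendered: bool = False


# Placeholder stages. Each one fakes the work a real engine would do.
def transcribe(project: Project) -> None:
    project.segments = [
        Segment(0.0, 2.5, "Hello and welcome."),
        Segment(2.5, 5.0, "Let's get started."),
    ]


def translate(project: Project) -> None:
    for seg in project.segments:
        seg.translated_text = f"[{project.target_language}] {seg.source_text}"


def synthesise_segment(seg: Segment) -> None:
    seg.audio = seg.translated_text.encode("utf-8")   # fake "audio"


def synthesise(project: Project) -> None:
    for seg in project.segments:
        synthesise_segment(seg)


def mix_and_render(project: Project) -> None:
    # The render only counts as done when every segment has audio.
    project.rendered = all(seg.audio for seg in project.segments)


STAGES = [transcribe, translate, synthesise, mix_and_render]


def run_pipeline(project: Project) -> None:
    """Run all four stages in the order listed in step 6."""
    for stage in STAGES:
        stage(project)


def fix_segment(project: Project, index: int, new_text: str) -> None:
    """Edit one translation, then redo synthesis for that segment only.

    Transcription and translation of the other segments are left untouched.
    Repeating the final mix here is an assumption of this sketch.
    """
    seg = project.segments[index]
    seg.translated_text = new_text
    synthesise_segment(seg)
    mix_and_render(project)
```

Calling `run_pipeline(Project(target_language="es"))` fills in both sample segments. A later `fix_segment(project, 1, "Empecemos.")` replaces the audio for the second segment only and leaves the first segment's audio unchanged.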

Now that you’ve done your first dub, explore the full workflow guides: