- Use the Sieve client package to call multiple existing functions in Python
- Combine these functions to create a customized app that meets our requirements
Introduction
As mentioned above, to create an app that can dub videos, we need several models to work together:
- WhisperX: an audio transcription model
- SeamlessT2T: a text-to-text translation model
- XTTS-V1: a text-to-speech model
- Sieve Video Retalker: an optimized version of video retalker for lipsyncing
Building the app from scratch
1
Set up folder and Python file
Create a folder and a Python file named video_dubbing.py with the following command:
2
Set up pipeline
Paste the following code into pipeline.py. The high-level logic of this code is as follows:
- Extract audio from the video
- Transcribe the audio
- Translate the transcript
- Generate new audio from the translated text
- Combine audio and video with our lipsyncer
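The ordering of the five steps above can be sketched in plain Python. This is a minimal illustration only, not the tutorial's actual pipeline code: `DubbingStages`, `dub_video`, and `make_stub_stages` are hypothetical names, and each stage is a stand-in callable rather than a call to a Sieve function. In the real app, each stage would presumably be backed by the corresponding model listed in the introduction.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DubbingStages:
    """Hypothetical container: one plain callable per pipeline step."""
    extract_audio: Callable[[str], str]   # video path -> audio path
    transcribe: Callable[[str], str]      # audio path -> transcript text
    translate: Callable[[str, str], str]  # (text, target language) -> translated text
    synthesize: Callable[[str], str]      # translated text -> new audio path
    lipsync: Callable[[str, str], str]    # (video path, audio path) -> dubbed video path


def dub_video(video_path: str, target_language: str, stages: DubbingStages) -> str:
    """Run the five steps in order and return the dubbed video path."""
    audio_path = stages.extract_audio(video_path)
    transcript = stages.transcribe(audio_path)
    translated = stages.translate(transcript, target_language)
    dubbed_audio_path = stages.synthesize(translated)
    return stages.lipsync(video_path, dubbed_audio_path)


def make_stub_stages(log: List[str]) -> DubbingStages:
    """Stand-in stages that only record the call order; no models are run."""
    def record(name: str, result: str) -> Callable[..., str]:
        def stage(*args: str) -> str:
            log.append(name)
            return result
        return stage

    return DubbingStages(
        extract_audio=record("extract_audio", "audio.wav"),
        transcribe=record("transcribe", "hello world"),
        translate=record("translate", "hola mundo"),
        synthesize=record("synthesize", "dubbed.wav"),
        lipsync=record("lipsync", "dubbed.mp4"),
    )


if __name__ == "__main__":
    calls: List[str] = []
    print(dub_video("input.mp4", "spanish", make_stub_stages(calls)))
    print(calls)
```

Passing the stages in as callables keeps the control flow separate from the models, so the ordering can be checked locally with stubs before any real model is wired in.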
3
Run the pipeline
Run the pipeline with the following command. Because we are running our function with video_dubbing.run, where video_dubbing is the local Sieve function, this will deploy the Sieve function and then run a job. You should start seeing some logs streaming in. You can also view the status of this job on the Sieve dashboard. After it has completed running, you'll see the path of a video file printed to the console; the file has been saved to a temporary directory. You can open this video to see the results.
4
View your dubbed file!
Open your video in your file explorer with the following command:
Or just view the video on the Sieve dashboard!
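If you would rather open the file from Python than from a shell, the sketch below shows one way to do it with the standard library. It is an assumption-laden example, not part of the Sieve tooling: `open_command` and `open_file` are hypothetical helpers, and the path you pass in should be the one printed to the console by the previous step.

```python
import os
import subprocess
import sys
from typing import List, Optional


def open_command(path: str, platform: str = sys.platform) -> Optional[List[str]]:
    """Return the command that opens `path` in the default app.

    Returns None on Windows, where os.startfile is used instead of a command.
    """
    if platform.startswith("win"):
        return None
    if platform == "darwin":
        return ["open", path]      # macOS
    return ["xdg-open", path]      # most Linux desktops


def open_file(path: str) -> None:
    """Open `path` with the operating system's default application."""
    cmd = open_command(path)
    if cmd is None:
        os.startfile(path)  # only available on Windows
    else:
        subprocess.run(cmd, check=True)
```

Calling `open_file` with the printed path should launch your default video player on a desktop system; on a headless machine, viewing the result on the dashboard is the simpler route.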