Example: Live Stream Transcription + Translation
Transcribe & translate live audio streams in real-time
Introduction
A common usecase for transcription is transcribing RTMP or HLS livestreams. Let’s walk through deploying a custom function to Sieve that can ingest a livestream and pass it into other functions using Sieve’s function calling capabilities for the purposes of transcription & translation.
Clone repo and run the example
Login with Sieve client
To get started, make sure you’re logged in via the Sieve Python client:
sieve login
Clone example
Clone the Sieve examples repository and navigate into the audio_transcription
directory.
git clone https://github.com/sieve-community/examples
cd examples/audio_transcription
Deploy custom function
The code we’ll be deploying can be found at examples/audio_transcription/live_transcriber.py
.
This is an example of a custom function to Sieve that can ingest a livestream and pass it into other functions using Sieve’s function calling capabilities.
At a high level, this module does the following:
- Imports necessary libraries and modules.
- Loads the
sieve/whisperx
model andsieve/seamless_text2text
model for translation. - Takes in audio from
ffmpeg
at a specified chunk size, splits by the last silence in the audio to prevent splitting words. - Runs transcription and translation if needed and yields out results with its start and end time.
Let’s deploy our function!
sieve deploy live_transcriber.py
Replicas of the live transcriber automatically scale to zero when there is no traffic, so be sure to check out the autoscaling guide to learn more about how to configure autoscaling for your custom functions if you want to keep them running and ready to process requests. You may otherwise notice a slight delay when you first call the function as the replicas are scaled up.
Try it out!
Let’s now use our function to transcribe a live audio stream in Python. Make sure to replace {your_org}
with your actual organization name. You can find your organization name in the dashboard settings.
import sieve
url = "http://stream.live.vc.bbcmedia.co.uk/bbc_world_service" # Public BBC service
language = "en" # If this language is different from the stream, translation will happen automatically
live_speech_transcriber = sieve.function.get("sieve-internal/live_speech_transcriber")
output = live_speech_transcriber.push(url, target_language=language)
print(f"Started job with id {output.job['id']}")
print("stream ingestor initializing (please set a replica if you'd like for this to be instant)...")
for segment in output.result():
print(segment)
The segment output will print as it’s being transcribed. You may also push a job via the REST API here and later poll for outputs with this endpoint.
This is an example of a job that will not cancel automatically as it’s processing a stream! To cancel the job once you’re done, you may use the cancel
API linked here. You may also cancel the job via the dashboard.
Streaming VTT captions
If you’d like to stream VTT captions directly to your video player, please follow our demo here.
Was this page helpful?