Introduction

Let’s say you want to transcribe RTMP or HLS livestreams. Let’s walk through deploying a custom function to Sieve that can ingest a livestream and pass it into other functions using Sieve’s function calling capabilities for the purposes of transcription & translation.

Using the python client

Ensure that the Sieve python package is installed on your system by executing the following command

pip show sievedata

If installed successfully, you will see an output detailing the package’s version, summary, dependencies, etc.

Then make sure to log in with your API key that can be found in the dashboard settings.

sieve login

Clone the Sieve examples repository and navigate into the audio_transcription directory.

git clone https://github.com/sieve-community/examples
cd examples/audio_transcription

Let’s now deploy a custom function to Sieve that can ingest a livestream and pass it into other functions using Sieve’s function calling capabilities by running the following command:

We won’t dive too deep into what the code does but at a high level it:

  1. Imports necessary libraries and modules.
  2. Loads the sieve/whisperx model and sieve/seamless_text2text model for translation.
  3. Takes in audio from ffmpeg at a specified chunk size, splits by the last silence in the audio to prevent splitting words.
  4. Runs transcription and translation if needed and yields out results with its start and end time.
sieve deploy live_transcriber.py

Congrats, you just deployed a custom function to Sieve! Replicas of the live transcriber automatically scale to zero when there is no traffic, so be sure to check out the autoscaling guide to learn more about how to configure autoscaling for your custom functions if you want to keep them running and ready to process requests. You may otherwise notice a slight delay when you first call the function as the replicas are scaled up.

Let’s now use it to transcribe a live audio stream in Python. Make sure to replace {your_org} with your actual organization name. You can find your organization name in the dashboard settings.

import sieve

url = "http://stream.live.vc.bbcmedia.co.uk/bbc_world_service"
language = "eng" # If this language is different from the stream, translation will happen automatically

live_speech_transcriber = sieve.function.get("{your_org}/live_transcriber")
output = live_speech_transcriber.push(url, target_language=language)

print(f"Started job with id {output.job['id']}")

print("stream ingestor initializing (please set a replica if you'd like for this to be instant)...")
for segment in output.result():
    print(segment)

The segment output will print as it’s being transcribed. You may also push a job via the REST API here and later poll for outputs with this endpoint.

This is an example of a job that will not cancel automatically as it’s processing a stream! To cancel the job once you’re done, you may use the cancel API linked here. You may also cancel the job via the dashboard.

Streaming VTT captions

If you’d like to stream VTT captions directly to your video player, please follow our demo here.