Autoscaling
Autoscaling your functions and models on Sieve
Sieve automatically scales functions up and down based on the traffic sent to them. The autoscaler bases its decisions on job queue length, current replica count, and average setup time (which depends on your container size and setup function). To better handle bursty workloads, we also keep replicas up for some time after a job completes. For more granular control over this autoscaling behavior, feel free to reach out!
You can configure the minimum and maximum replicas for a function on the function version settings page. Your function will scale up to your minimum replica count even in the absence of job traffic.
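To make the inputs above concrete, here is a minimal sketch of how a queue-based scaling decision could combine them. It is not Sieve's actual autoscaler: the function name `desired_replicas`, the `avg_job_seconds` input, and the "drain the queue within roughly one setup time" rule are all assumptions made for illustration.

```python
import math


def desired_replicas(queue_length, current_replicas, avg_job_seconds,
                     avg_setup_seconds, min_replicas=0, max_replicas=10):
    """Hypothetical scaling heuristic (NOT Sieve's real implementation).

    Scales up only when the queue would take longer to drain than a new
    replica takes to set up, then clamps to the configured min/max.
    """
    if avg_setup_seconds <= 0:
        raise ValueError("avg_setup_seconds must be positive")

    target = current_replicas
    if queue_length > 0:
        backlog_seconds = queue_length * avg_job_seconds
        # Time for the existing replicas to work through the queue.
        if current_replicas > 0:
            drain_seconds = backlog_seconds / current_replicas
        else:
            drain_seconds = math.inf
        if drain_seconds > avg_setup_seconds:
            # Assumed rule: enough replicas to drain in about one setup time.
            needed = math.ceil(backlog_seconds / avg_setup_seconds)
            target = max(needed, current_replicas)

    # The minimum applies even with no traffic; the maximum caps bursts.
    return max(min_replicas, min(target, max_replicas))


if __name__ == "__main__":
    print(desired_replicas(0, 0, 6, 30, min_replicas=2))   # no traffic
    print(desired_replicas(100, 1, 6, 30))                 # large burst
    print(desired_replicas(3, 2, 5, 30))                   # small queue
```

In this sketch a small queue does not trigger a scale-up, because the existing replicas would finish it before a new replica was ready. Scale-down is deliberately left out, since replicas are kept up for some time after a job completes.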
For use cases where cold starts greatly affect performance, we suggest keeping a few minimum replicas up. Note, however, that your usage also includes idle time (time when your function is not processing any jobs). To track your usage, refer to the Usage tab in the dashboard.
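As a rough way to reason about the idle-time trade-off, the sketch below estimates how much of a time window minimum replicas would spend idle. It assumes usage is counted in replica-seconds and that the function never scales above its minimum during the window. The function name and parameters are hypothetical, and the Usage tab remains the source of truth.

```python
def idle_estimate(window_seconds, min_replicas, busy_seconds):
    """Estimate idle replica-seconds for always-on minimum replicas.

    Assumption: the function stays at exactly `min_replicas` for the
    whole window, so total up-time is window_seconds * min_replicas.
    """
    up_seconds = window_seconds * min_replicas
    # Idle time is whatever up-time was not spent processing jobs.
    idle_seconds = max(up_seconds - busy_seconds, 0)
    idle_fraction = idle_seconds / up_seconds if up_seconds else 0.0
    return up_seconds, idle_seconds, idle_fraction


if __name__ == "__main__":
    # Two minimum replicas over one hour, with 30 minutes of job processing.
    up, idle, fraction = idle_estimate(3600, 2, 1800)
    print(up, idle, fraction)
```

If the estimated idle fraction is high, a lower minimum replica count may be a better fit unless cold starts are unacceptable for your workload.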