Query: Overview

There are many ways to think of Sieve: as an AI processing engine, as a queryable database, as a way to fine-tune computer vision models over time, and more. Below, we explore the database aspect of Sieve: how information is stored in it and how that information can be retrieved.

Everything in a video is an object

A person, a car, a dog, and even the frame itself are all objects. A video consists of objects, each defined by a set of properties.

Objects have properties that change over time

More specifically, an object is defined by two kinds of properties: those that stay fixed and those that change over time. For example, every object might have a class attribute such as person or car, which doesn’t change. Other properties, such as position, speed, and lighting, do.


Traditionally, videos could only be “queried” by timestamp to find the information in a given frame. Sieve instead takes an object-first approach and treats everything as an object. Every object has static attributes that stay fixed, such as class, start_frame, end_frame, object_id, and more. Every object also has attributes that change over time, such as position, speed, size, and more.
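To make the object-first model concrete, here is a minimal Python sketch of an object with static and time-varying attributes, plus a query over a small list of such objects. This is an illustration only, not Sieve's actual schema or API: the VideoObject class, the objects_matching helper, the per-frame dictionary layout, and the sample values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class VideoObject:
    """Hypothetical object record; not Sieve's real schema."""
    # Static attributes: fixed for the lifetime of the object.
    object_id: str
    cls: str  # stands in for "class", which is a reserved word in Python
    start_frame: int
    end_frame: int
    # Time-varying attributes: one value per frame, keyed by frame number.
    position: Dict[int, Tuple[float, float]] = field(default_factory=dict)
    speed: Dict[int, float] = field(default_factory=dict)


def objects_matching(objects: List[VideoObject], cls: str, min_speed: float) -> List[VideoObject]:
    """Return objects of the given class whose speed reaches min_speed in any frame."""
    return [
        obj for obj in objects
        if obj.cls == cls and any(s >= min_speed for s in obj.speed.values())
    ]


# Made-up sample data for illustration.
objects = [
    VideoObject("obj-1", "person", 0, 2, speed={0: 1.0, 1: 1.5, 2: 1.2}),
    VideoObject("obj-2", "car", 1, 3, speed={1: 8.0, 2: 12.5, 3: 11.0}),
    VideoObject("obj-3", "car", 0, 1, speed={0: 2.0, 1: 2.5}),
]

fast_cars = objects_matching(objects, "car", 10.0)
print([obj.object_id for obj in fast_cars])  # prints ['obj-2']
```

The query starts from objects and their properties rather than from a timestamp, which is the shift the object-first approach describes.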

Video frames are also treated as objects, each with a bounding box that covers the entire frame. Frames can have their own attributes, such as lighting, resolution, or the results of other frame-level classifications.
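As a rough sketch of the idea, a frame can be written in the same shape as any other object, with a bounding box spanning the full image. The frame_object helper, the field names, the "frame" class label, and the choice to give each frame a one-frame lifetime are assumptions for illustration, not Sieve's documented format.

```python
def frame_object(frame_number, width, height, **frame_attributes):
    """Build a hypothetical object record that represents a whole frame."""
    return {
        "object_id": f"frame-{frame_number}",
        "class": "frame",
        # Assumption: a frame object lives for exactly one frame.
        "start_frame": frame_number,
        "end_frame": frame_number,
        # The bounding box covers the entire frame.
        "bounding_box": {"x1": 0, "y1": 0, "x2": width, "y2": height},
        # Frame-level attributes such as lighting or resolution.
        **frame_attributes,
    }


frame = frame_object(42, 1920, 1080, lighting="dim", resolution="1920x1080")
print(frame["bounding_box"])  # prints {'x1': 0, 'y1': 0, 'x2': 1920, 'y2': 1080}
```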

Sieve automatically tracks objects across frames, which is what makes object-level retrieval possible. Whether the task is object detection, semantic segmentation, or another kind of classification, Sieve’s paradigm is the same.
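The sketch below shows why tracking matters for this kind of retrieval: once each per-frame detection carries a stable object identifier, detections can be grouped into one record per object, with static fields and a per-frame history. It assumes a tracker has already assigned object_id values; the group_by_object function, the detection format, and the sample data are hypothetical and do not reflect Sieve's internals.

```python
def group_by_object(detections):
    """Group per-frame detections into one record per tracked object."""
    objects = {}
    for det in sorted(detections, key=lambda d: d["frame"]):
        # Create the record the first time this object_id is seen.
        obj = objects.setdefault(det["object_id"], {
            "object_id": det["object_id"],
            "class": det["class"],
            "start_frame": det["frame"],
            "end_frame": det["frame"],
            "position": {},  # time-varying attribute, keyed by frame number
        })
        # Detections arrive in frame order, so the latest frame is the end.
        obj["end_frame"] = det["frame"]
        obj["position"][det["frame"]] = det["position"]
    return objects


# Made-up detections, already tagged with object IDs by a tracker.
detections = [
    {"frame": 0, "object_id": "a", "class": "person", "position": (10, 20)},
    {"frame": 1, "object_id": "a", "class": "person", "position": (12, 20)},
    {"frame": 1, "object_id": "b", "class": "dog", "position": (50, 60)},
    {"frame": 2, "object_id": "a", "class": "person", "position": (14, 21)},
]

objects = group_by_object(detections)
print(objects["a"]["start_frame"], objects["a"]["end_frame"])  # prints 0 2
```

Without the shared identifier, the same four detections could only be looked up frame by frame.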