DREEM Models

User-facing models

There are two main model APIs users should interact with.

  1. GlobalTrackingTransformer is the underlying model architecture we use for tracking. It is made up of a VisualEncoder and a Transformer Encoder-Decoder. Only advanced users who are familiar with Python and PyTorch should interact with this model directly. Everyone else should use GTRRunner, described next.
  2. GTRRunner is a pytorch_lightning wrapper around the GlobalTrackingTransformer. It implements the basic routines you need for training, validation, and testing. Most users will interact with this model.

Model Parts

For advanced users who are interested in extending our model, we have modularized each component so that it's easy to compose them into your own custom model. The model parts are:

  1. VisualEncoder: A CNN backbone used for feature extraction.
  2. Transformer: the Transformer Encoder-Decoder.
  3. AttentionHead: computes the association matrix from the transformer output.
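
To make the composition concrete, here is a minimal sketch of how three parts like these could be wired together in plain PyTorch. It is not the DREEM API: the classes ToyVisualEncoder, ToyAttentionHead, and ToyTracker, the layer sizes, and the dot-product scoring are all assumptions made for illustration. Only the overall flow (CNN features, then a transformer encoder-decoder, then a head that produces an association matrix) follows the description above.

```python
import torch
from torch import nn


class ToyVisualEncoder(nn.Module):
    """Hypothetical CNN backbone: image crops -> one feature vector per crop."""

    def __init__(self, d_model: int = 32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # (N, 8, 1, 1)
            nn.Flatten(),             # (N, 8)
            nn.Linear(8, d_model),    # (N, d_model)
        )

    def forward(self, crops: torch.Tensor) -> torch.Tensor:
        return self.backbone(crops)


class ToyAttentionHead(nn.Module):
    """Hypothetical head: scores every query instance against every reference."""

    def __init__(self, d_model: int = 32):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)

    def forward(self, queries: torch.Tensor, keys: torch.Tensor) -> torch.Tensor:
        # (B, Nq, D) @ (B, D, Nk) -> (B, Nq, Nk) association scores
        return self.q_proj(queries) @ self.k_proj(keys).transpose(-1, -2)


class ToyTracker(nn.Module):
    """Illustrative composition of the three parts (not the real model)."""

    def __init__(self, d_model: int = 32):
        super().__init__()
        self.visual_encoder = ToyVisualEncoder(d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=1, num_decoder_layers=1,
            dim_feedforward=64, batch_first=True,
        )
        self.attn_head = ToyAttentionHead(d_model)

    def forward(self, ref_crops: torch.Tensor, query_crops: torch.Tensor) -> torch.Tensor:
        # Treat all crops in a window as one sequence with batch size 1.
        ref = self.visual_encoder(ref_crops).unsqueeze(0)      # (1, Nr, D)
        query = self.visual_encoder(query_crops).unsqueeze(0)  # (1, Nq, D)
        memory = self.transformer.encoder(ref)
        decoded = self.transformer.decoder(query, memory)
        return self.attn_head(decoded, memory).squeeze(0)      # (Nq, Nr)


model = ToyTracker().eval()
with torch.no_grad():
    # 5 reference crops and 3 query crops, each 1 x 16 x 16 (random data)
    asso = model(torch.randn(5, 1, 16, 16), torch.randn(3, 1, 16, 16))
print(asso.shape)
```

Each row of the resulting matrix corresponds to one query instance and each column to one reference instance, so swapping in a different backbone or head only requires keeping the feature dimension consistent.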