
Usage

This page gives detailed instructions for using DREEM. We also have a quickstart guide and notebooks in the Examples section to help you get started.

Installation

Head over to the installation guide to get started.

Training

DREEM enables you to train your own model based on your own annotated data. This can be useful when the pretrained models or traditional approaches to tracking don't work well for your data.

Generate Ground Truth Data

To train a model, you need:

  1. A video
    • For animal data see the imageio docs for supported file types. Common ones include mp4, avi, etc.
    • For microscopy data we currently support .tif stacks.
  2. A ground truth labels file in SLEAP or Cell Tracking Challenge format. This labels file must contain:
    1. Detections (i.e. locations of the instances in each frame). This can come in the form of centroids or pose keypoints for SLEAP format data, or segmentation masks for Cell Tracking Challenge format data.
    2. Ground truth identities (also called tracks). These are temporally consistent labels that link detections across time.
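
Before proofreading or training, it's worth programmatically confirming that the labels file actually contains both detections and track assignments. Below is a minimal sketch using sleap-io; the file name is illustrative, and attribute names may differ slightly across sleap-io versions.

import sleap_io as sio

labels = sio.load_slp("vid_1.slp")  # illustrative path
# Count instances that were never assigned to a track.
untracked = sum(
    1
    for lf in labels.labeled_frames
    for inst in lf.instances
    if inst.track is None
)
print(f"{len(labels.labeled_frames)} labeled frames, {len(labels.tracks)} tracks")
print(f"{untracked} instances missing a track assignment")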

Get Initial Labels

To generate your initial labels, we recommend a couple of methods:

  • For animal tracking, we recommend using SLEAP. SLEAP provides a graphical user interface that makes it easy to annotate data from scratch and export a labels file in the SLEAP format.
  • For microscopy tracking, check out CellPose or Ilastik (see the segmentation sketch after this list). These methods output segmentation masks but do not provide tracks; Fiji offers several end-to-end segmentation and tracking options. Recall that your labels file must contain tracks.
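
As one example, here is a minimal sketch of generating per-frame masks with CellPose; the exact API differs between CellPose versions, and the path is illustrative.

import tifffile
from cellpose import models

stack = tifffile.imread("movie.tif")  # (frames, height, width); illustrative path
model = models.Cellpose(model_type="cyto")  # pretrained cytoplasm model
# eval() returns (masks, flows, styles, diams); keep only the masks here.
masks = [model.eval(frame, channels=[0, 0])[0] for frame in stack]
# Each mask is integer-labeled per object but carries no track identities --
# a tracking step (e.g. TrackMate) is still needed before training.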

Proofreading

Once you have your labels file containing initial detections and tracks, you'll want to proofread your labels. Obtaining good results relies on having accurate ground truth tracks. The annotated data should follow these guidelines:

  1. No identity switches. This is important for training a model that maintains temporally consistent identities.

  2. Good detections. Since the input to the model is a crop centered around each detection, we want to make sure the coordinates we crop around are as accurate as possible.

We recommend using the sleap-label GUI for proofreading. SLEAP provides tools that make it easy to correct errors in tracking.

Converting data to a SLEAP-compatible format

In order to use the SLEAP GUI, you'll need your labels and videos in a SLEAP-compatible format. Check out the sleap-io docs for available formats. The easiest way to ensure your labels are compatible with SLEAP is to convert them to a .slp file. If you used a different system (e.g. DeepLabCut), check out sleap.io.convert for available converters. For microscopy, we highly recommend starting out with TrackMate and then proofreading in the SLEAP GUI; here is a converter from TrackMate's output to a .slp file. In general, you can use sleap-io to write a custom converter to .slp if you'd like to use the SLEAP GUI for proofreading, as sketched below.
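
Here is a hedged sketch of such a custom converter, assuming per-frame centroid detections with track IDs have already been parsed from your tool of choice; all paths and the input structure are illustrative, and sleap-io API details may vary slightly across versions.

import numpy as np
import sleap_io as sio

# Hypothetical parsed detections: frame index -> list of (track_id, x, y).
detections = {
    0: [(1, 10.0, 20.0), (2, 30.0, 40.0)],
    1: [(1, 12.0, 21.0), (2, 29.0, 42.0)],
}

video = sio.Video.from_filename("vid_1.mp4")  # illustrative path
skeleton = sio.Skeleton(["centroid"])  # single-node skeleton for centroid-only labels
tracks = {}  # track_id -> sio.Track, reused across frames for temporal consistency

frames = []
for frame_idx, dets in sorted(detections.items()):
    instances = []
    for track_id, x, y in dets:
        track = tracks.setdefault(track_id, sio.Track(name=str(track_id)))
        points = np.array([[x, y]])  # one (x, y) row per skeleton node
        instances.append(sio.Instance.from_numpy(points, skeleton=skeleton, track=track))
    frames.append(sio.LabeledFrame(video=video, frame_idx=frame_idx, instances=instances))

labels = sio.Labels(videos=[video], skeletons=[skeleton], labeled_frames=frames)
labels.save("vid_1.slp")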

Organize data

For animal tracking, you'll need video/slp pairs in a directory that you can specify in the configuration files when training. For instance, you can have separate train/val/test directories, each with slp/video pairs. The naming convention is not important, as the .slp labels file stores a reference to its associated video file when it's created.
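
Since each .slp stores a reference to its video, it can save headaches to check that those references resolve from the machine you'll train on. A small sketch using sleap-io, with the directory matching the SLEAP-format layout shown below (this assumes single-file videos; video.filename can be a list for image sequences):

from pathlib import Path
import sleap_io as sio

for slp_path in Path("dataset_name/train").glob("*.slp"):
    labels = sio.load_slp(slp_path)
    for video in labels.videos:
        # Print whether the referenced video file exists on disk.
        print(slp_path.name, "->", video.filename, Path(video.filename).exists())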

For microscopy tracking, you'll need to organize your data in the Cell Tracking Challenge format. We've provided a sample directory structure below. Check out this guide for more details.

Using the SLEAP format:

dataset_name/
    train/
        vid_1.{VID_EXTENSION}
        vid_1.slp
            .
            .
            .
        vid_n.{VID_EXTENSION}
        vid_n.slp
    val/
        vid_1.{VID_EXTENSION}
        vid_1.slp
            .
            .
            .
        vid_n.{VID_EXTENSION}
        vid_n.slp
    test/ # optional; test sets are not automatically evaluated as part of training
        vid_1.{VID_EXTENSION}
        vid_1.slp
            .
            .
            .
        vid_n.{VID_EXTENSION}
        vid_n.slp
The Cell Tracking Challenge format requires a directory of raw tifs and a matching directory of labelled segmentation masks that carry the track labels (see the sketch after the directory layout below). The directory structure is as follows:
dataset_name/
    train/
        subdir_0/
            frame0.tif # these are raw images
            ...
            frameN.tif
        subdir_0_GT/TRA # these are labelled segmentation masks
            frame0.tif
            ...
            frameN.tif
        subdir_1/
            frame0.tif
            ...
            frameN.tif
        subdir_1_GT/TRA
            frame0.tif
            ...
            frameN.tif
        ...
    val/
        subdir_0/
        subdir_0_GT/TRA
        subdir_1/
        subdir_1_GT/TRA
        ...
    test/ # optional; test sets are not automatically evaluated as part of training
        subdir_0/
        subdir_0_GT/TRA
        subdir_1/
        subdir_1_GT/TRA
        ...
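
In the TRA directories, each mask is an integer-labeled image: a pixel's value is the ID of the object it belongs to, with 0 as background. A quick way to inspect one (path illustrative):

import numpy as np
import tifffile

mask = tifffile.imread("dataset_name/train/subdir_0_GT/TRA/frame0.tif")
ids = np.unique(mask)
print("object IDs in this frame:", ids[ids != 0])  # 0 is background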

Training

Now that you have your dataset set up, let's start training a model! We provide a CLI that allows you to train with a simple command and a single yaml configuration file.

Setup Config

The input to our training script is a .yaml file that contains all the parameters needed for training. Please see here for a detailed description of all the parameters and how to set up the config. We provide up-to-date sample configs on HuggingFace Hub in the talmolab/microscopy-pretrained and talmolab/animals-pretrained repositories. In general, the best practice is to keep a single base.yaml file with all the default arguments you'd like to use, plus a second .yaml file that overrides a specific set of parameters when training.

Train Model

Once you have your config file and dataset set up, training is as easy as running

dreem-train --config-base=[CONFIG_DIR] --config-name=[CONFIG_STEM]
where CONFIG_DIR is the directory hydra should search for the config file and CONFIG_STEM is the name of the config without the .yaml extension.

For example, if you have a config file called base.yaml inside your /home/user/dreem_configs directory, you can call

dreem-train --config-base=/home/user/dreem_configs --config-name=base

Note: you can use relative paths as well, but they can be a bit riskier, so we recommend absolute paths whenever possible.

If you've been through the example notebooks, you'll notice that training was done using the API rather than the CLI. You can use whichever you prefer.

Overriding Arguments

Instead of changing the base.yaml file every time you want to train a model with a different configuration, hydra enables us to either

  1. provide another .yaml file with a subset of the parameters to override
  2. provide the args to the CLI directly

We recommend using the file-based override for logging and reproducibility.
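
Under the hood, hydra is built on OmegaConf, and the override file behaves like an OmegaConf merge: keys present in the override replace the corresponding base values, while everything else is inherited. The sketch below only illustrates those merge semantics; it is not DREEM's actual internals.

from omegaconf import OmegaConf

base = OmegaConf.load("base.yaml")  # full default config
override = OmegaConf.load("override_params.yaml")  # small subset of keys
merged = OmegaConf.merge(base, override)  # override wins on conflicting keys
print(OmegaConf.to_yaml(merged))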

For overriding specific params with an override file, you can run:

dreem-train --config-base=[CONFIG_DIR] --config-name=[CONFIG_STEM] ++params_config="/path/to/override_params.yaml"

For example, if you have an override_params.yaml file inside your /home/user/dreem_configs directory that contains only the small selection of parameters you'd like to override, you can run:

dreem-train --config-base=/home/user/dreem_configs --config-name=base ++params_config=/home/user/dreem_configs/override_params.yaml
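
For the direct CLI override (option 2 above), hydra accepts dot-notation key=value arguments appended to the command. The parameter name below is hypothetical; substitute keys from your own config:

dreem-train --config-base=/home/user/dreem_configs --config-name=base ++trainer.max_epochs=10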

Output

The output of the train script will be at least one .ckpt file, assuming you've configured the checkpointing section of the config correctly.

Eval (with ground truth labels)

To test the performance of your model, you can use the dreem-eval CLI. It computes multi-object tracking metrics on your test data and outputs them in h5 format.
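
Once evaluation has run, a version-safe way to explore the metrics file is to list its contents rather than hard-coding keys. A sketch with h5py; the file name is illustrative:

import h5py

with h5py.File("results/metrics.h5", "r") as f:  # illustrative path
    # Walk the file and print every group/dataset it contains.
    f.visititems(lambda name, obj: print(name, obj))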

Setup data

Note that your data should have ground truth labels. You can arrange it in a test/ directory as shown above.

Setup config

Sample configs are available at talmolab/microscopy-pretrained and talmolab/animals-pretrained, and a detailed walkthrough is available here.

Run evaluation

We provide a CLI that allows you to evaluate your model with a simple command and a single yaml configuration file.

dreem-eval --config-base=[CONFIG_DIR] --config-name=[CONFIG_STEM]

Output

Tracking results will be saved as .slp in the directory specified by the outdir argument. If you don't set outdir in the config, results are saved to ./results.
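
You can then load the tracked output with sleap-io to summarize the recovered identities; the file name is illustrative:

import sleap_io as sio

results = sio.load_slp("results/vid_1.slp")  # illustrative path
print(f"{len(results.tracks)} tracks across {len(results.labeled_frames)} labeled frames")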

You should now be ready to use DREEM to train and track your own data!