The UG2 Dataset

Track 1: Video object classification and detection from unconstrained mobility platforms

About the Dataset

UG2 contains three difficult real-world scenarios: uncontrolled videos taken by UAVs and manned gliders, as well as controlled videos taken on the ground. Over 150,000 annotated frames for hundreds of ImageNet classes are available.


The data we release can be used for training and validation. For restoration and enhancement approaches that must be trained, we encourage a cross-dataset protocol in which some annotated training data comes from outside UG2, and we encourage participants to use their own data or data from other sources for training. Un-annotated videos are also provided for additional validation and parameter tuning.

Data Annotations

The dataset contains object-level annotations for 162,136 images. Bounding boxes establishing object regions were manually annotated using the VATIC Video Annotation Tool, and we provide the VATIC annotation files for every annotated video in the dataset.

Each annotation file follows the annotation structure provided by VATIC. Each line contains one object annotation which is defined by 10 columns. The definition of each column is as follows:

  1. Track ID. All rows with the same ID belong to the same path of the same object through different video frames.
  2. xmin. The top left x-coordinate of the bounding box.
  3. ymin. The top left y-coordinate of the bounding box.
  4. xmax. The bottom right x-coordinate of the bounding box.
  5. ymax. The bottom right y-coordinate of the bounding box.
  6. frame. The frame that this annotation represents.
  7. lost. If 1, the annotation is outside of the view screen. In this case we did not extract any cropped region.
  8. occluded. If 1, the annotation is occluded. In this case we did not extract any cropped region.
  9. generated. If 1, the annotation was automatically interpolated.
  10. label. The class for this annotation, enclosed in quotation marks.
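
The column layout described above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: it assumes the fields on each line are separated by whitespace, that the label is the tenth field and is wrapped in double quotes, and that anything after the tenth field can be ignored. The names VaticBox, parse_vatic_line and has_crop are illustrative and are not part of VATIC or UG2.

```python
import shlex
from typing import NamedTuple


class VaticBox(NamedTuple):
    """One annotation row (hypothetical container, field order as listed above)."""
    track_id: int
    xmin: int
    ymin: int
    xmax: int
    ymax: int
    frame: int
    lost: bool
    occluded: bool
    generated: bool
    label: str


def parse_vatic_line(line: str) -> VaticBox:
    """Parse one annotation line, assuming whitespace-separated fields."""
    # shlex.split keeps a quoted label such as "police car" as a single field
    # and strips the surrounding quotation marks.
    fields = shlex.split(line)
    if len(fields) < 10:
        raise ValueError(f"expected at least 10 columns, got {len(fields)}")
    numbers = [int(value) for value in fields[:9]]
    flags = [bool(value) for value in numbers[6:9]]
    return VaticBox(*numbers[:6], *flags, fields[9])


def has_crop(box: VaticBox) -> bool:
    """True when the box is neither lost nor occluded.

    Per the column descriptions, no cropped region was extracted for
    lost or occluded annotations.
    """
    return not (box.lost or box.occluded)


if __name__ == "__main__":
    # Made-up example row, not taken from the dataset.
    example = '3 10 20 110 220 5 0 1 0 "police car"'
    box = parse_vatic_line(example)
    print(box.label, box.frame, has_crop(box))
```

Filtering rows with has_crop before cropping mirrors the rule stated for the lost and occluded columns; interpolated boxes (generated = 1) are kept, since the description does not exclude them.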