Track 2: Object Detection in Poor Visibility Environments

About the Dataset

We structure this track into three sub-challenges. Each sub-challenge features a different poor-visibility outdoor condition and a different training protocol (paired versus unpaired images, annotated versus unannotated, etc.).

Training & Evaluation

In all three sub-challenges, participating teams are allowed to use external training data not mentioned above, including self-synthesized or self-collected data, but they must state so in their submissions. The ranking criterion will be the mean Average Precision (mAP) on each testing set, with an Intersection-over-Union (IoU) threshold of 0.5. If the IoU between a detected region and an annotated object (or face) region is greater than 0.5, a score of 1 is assigned to the detected region, and 0 otherwise. When mAPs at an IoU of 0.5 are equal, the mAPs at higher IoU thresholds (0.6, 0.7, 0.8) will be compared sequentially.
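The scoring rule and the tie-break above can be sketched in a few lines of Python. This is only an illustration, not the official evaluation code: the box format `(x_min, y_min, x_max, y_max)` and the helper names `iou`, `detection_score`, and `rank_teams` are assumptions made for this example, and a full mAP computation (per-class precision-recall curves, matching each annotation at most once) is omitted.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned boxes."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Width and height are clamped at zero so disjoint boxes give no overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_score(detected: Box, annotated: Box, threshold: float = 0.5) -> int:
    """Return 1 if the IoU exceeds the threshold, and 0 otherwise."""
    return 1 if iou(detected, annotated) > threshold else 0


def rank_teams(maps_by_team: Dict[str, Sequence[float]]) -> List[str]:
    """Order hypothetical team names, best first.

    Each value is assumed to hold (mAP@0.5, mAP@0.6, mAP@0.7, mAP@0.8).
    Tuples compare element by element, so a tie at 0.5 falls through
    to 0.6, then 0.7, then 0.8.
    """
    return sorted(maps_by_team, key=lambda team: tuple(maps_by_team[team]), reverse=True)
```

For example, `rank_teams({"a": (0.5, 0.4, 0.3, 0.2), "b": (0.5, 0.45, 0.1, 0.1)})` places `"b"` ahead of `"a"`, because the two are tied at an IoU of 0.5 and `"b"` has the higher mAP at 0.6.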

Sub-Challenge 2.1: (Semi-)Supervised Object Detection in the Haze

We provide a set of 4,322 real-world hazy images collected from traffic surveillance, all labeled with object bounding boxes and categories (car, bus, bicycle, motorcycle, pedestrian), as the main training and/or validation sets. We also release another set of 4,807 unannotated real-world hazy images collected from the same sources (and containing the same classes of traffic objects, though not annotated), which may be used at the participants’ discretion. There will be a hold-out testing set of 3,000 real-world hazy images, with the same classes of objects annotated.

Sub-Challenge 2.2: (Semi-)Supervised Face Detection in the Low Light Condition

We provide 6,000 real-world low-light images captured at nighttime at teaching buildings, streets, bridges, overpasses, parks, etc., all labeled with bounding boxes for human faces, as the main training and/or validation sets. We also provide 9,000 unlabeled low-light images collected from the same settings. Additionally, we provide a unique set of 1,000 paired low-light / normal-light images captured in controllable real lighting conditions (but not necessarily containing faces), which can be used as part of the training data at the participants’ discretion. There will be a hold-out testing set of 4,000 low-light images, with human face bounding boxes annotated.

Sub-Challenge 2.3: Zero-Shot Object Detection with Raindrop Occlusions

We provide 1,010 pairs of raindrop images and corresponding clean ground-truth images (collected through physical simulations) as the training and/or validation sets. Unlike Sub-Challenges 2.1 and 2.2, no semantic annotation will be available on the training/validation images. A hold-out testing set of 2,496 real-world raindrop images is collected from high-resolution driving videos, in diverse real locations and scenes during multiple drives. We label bounding boxes for selected traffic object categories: car, person, bus, bicycle, and motorcycle.

If you have any questions about this challenge track please feel free to email