The Fishyscapes Benchmark

Anomaly Detection for Semantic Segmentation

Real Captured Data

captured with the same setup as Cityscapes

We evaluate methods on our dense annotations of image data originally captured for the Lost and Found dataset.

Open World Benchmark

continuously changing anomaly objects

To test methods in open-world conditions, they are evaluated against random anomalies from the web that change every few months.

Stay in Touch

Subscribe with your email to receive updates

Safe Deployment of Deep Learning on Robots

Research has produced Deep Learning methods that are increasingly accurate and begin to generalise over illumination changes and similar variations. However, modern networks are also known to be overconfident when exposed to anomalous or novel inputs.

The figure shows a prediction of DeepLabv3+, one of the leading methods for semantic segmentation in benchmarks like Cityscapes, Mapillary, or the Robust Driving Benchmark. While the sheep does not fit into the set of classes the network has been trained on, the network very confidently assigns it the classes street, human, or sidewalk.

The Fishyscapes Benchmark compares research approaches to detecting anomalies in the input. It thereby bridges another gap towards deploying learning systems on autonomous platforms, which by definition have to deal with unexpected inputs and anomalies.

Anomaly Detection for Semantic Segmentation

Anomaly detection and uncertainty estimation from deep learning models is a subject of active research. However, fundamental ML research is often evaluated only on MNIST, CIFAR, and similar datasets. In the Fishyscapes Benchmark, we test how such methods transfer to the much more complex task of semantic segmentation.
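To illustrate what transferring such a method to segmentation involves, here is a minimal sketch of one common baseline: scoring each pixel by one minus its maximum softmax probability, so that pixels where the network is least confident in any class receive the highest anomaly score. The function name and toy logits are illustrative assumptions, not part of the benchmark itself.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def pixelwise_anomaly_score(logits):
    """Per-pixel anomaly score from segmentation logits of shape (H, W, C).

    Uses 1 - maximum softmax probability: a confident pixel scores
    near 0, a pixel with near-uniform class probabilities scores high.
    """
    probs = softmax(logits, axis=-1)
    return 1.0 - probs.max(axis=-1)

# Toy example: a 2x2 "image" with 3 classes.
logits = np.array([
    [[5.0, 0.0, 0.0],    # confident pixel -> low anomaly score
     [0.1, 0.0, 0.05]],  # uncertain pixel -> high anomaly score
    [[0.0, 4.0, 0.0],
     [1.0, 1.0, 1.0]],   # perfectly uniform -> score 2/3
])
scores = pixelwise_anomaly_score(logits)
```

Unlike image-level anomaly detection on MNIST or CIFAR, the output here is a dense score map, which is then thresholded or evaluated against per-pixel anomaly annotations.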


To process your submission, our team is dedicated to helping you get your method running on our cluster. Please contact us with any questions.

Sun Boyang

Xing Jiaxu

Hermann Blum