Efficient Multi-task Uncertainties for Joint Semantic Segmentation and Monocular Depth Estimation

Apr 24, 2025

Dr. Steven Landgraf

Markus Hillemann

Theodor Kapler

Markus Ulrich


Image credit: Dr. Steven Landgraf

Abstract

Quantifying predictive uncertainty has emerged as a possible solution to common challenges of deep neural networks, such as overconfidence, lack of explainability, and limited robustness, albeit one that is often computationally expensive. Many real-world applications are multi-modal in nature and hence benefit from multi-task learning. In autonomous driving or robotics, for example, jointly solving semantic segmentation and monocular depth estimation has proven to be valuable. To this end, we introduce EMUFormer, a novel student-teacher distillation approach for efficient multi-task uncertainties in the context of joint semantic segmentation and monocular depth estimation. By leveraging the predictive uncertainties of the teacher, EMUFormer achieves new state-of-the-art results on Cityscapes and NYUv2. It additionally estimates high-quality predictive uncertainties for both tasks that are comparable or superior to those of a Deep Ensemble, despite being an order of magnitude more efficient.
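To make the idea of distilling a teacher's predictive uncertainties more concrete, the following is a minimal per-pixel sketch in plain Python. It is not the EMUFormer implementation. The abstract does not specify the uncertainty measures or the loss, so every choice here is an assumption made for illustration: the entropy of the ensemble-averaged softmax for segmentation, the standard deviation across ensemble members for depth, and a mean absolute error as the distillation term. All function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_segmentation_uncertainty(member_logits):
    """Teacher uncertainty for one pixel (assumed measure: predictive entropy).

    member_logits holds one list of class logits per ensemble member.
    Returns the ensemble-averaged class probabilities and their entropy.
    """
    probs = [softmax(logits) for logits in member_logits]
    n_classes = len(probs[0])
    mean_probs = [mean(p[c] for p in probs) for c in range(n_classes)]
    entropy = -sum(p * math.log(p) for p in mean_probs if p > 0.0)
    return mean_probs, entropy


def ensemble_depth_uncertainty(member_depths):
    """Teacher uncertainty for one pixel (assumed measure: std across members).

    Returns the mean depth and the population standard deviation.
    """
    return mean(member_depths), pstdev(member_depths)


def uncertainty_distillation_loss(student_unc, teacher_unc):
    """Hypothetical distillation term: mean absolute error between the
    student's predicted uncertainties and the teacher's, over pixels."""
    return mean(abs(s - t) for s, t in zip(student_unc, teacher_unc))


# Usage: three ensemble members, one pixel, three classes.
seg_probs, seg_unc = ensemble_segmentation_uncertainty(
    [[2.0, 0.5, 0.1], [1.5, 1.0, 0.2], [0.3, 2.2, 0.1]]
)
depth_mean, depth_unc = ensemble_depth_uncertainty([10.2, 11.0, 9.6])

# A single student predicts both uncertainties directly and is penalised
# for deviating from the teacher's values.
loss = uncertainty_distillation_loss([0.9, 0.5], [seg_unc, depth_unc])
```

In a sketch like this, the efficiency gain would come from the student needing one forward pass at inference time, whereas the ensemble teacher needs one pass per member.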

Type

Conference paper

Publication

In DAGM German Conference on Pattern Recognition

Last updated on Apr 24, 2025

Deep Learning, Semantic Segmentation, Monocular Depth Estimation, Uncertainty Quantification, Vision Transformer

Authors

Dr. Steven Landgraf

Research Scientist (PostDoc)
