HeatNet: Bridging the Day-Night Domain Gap in Semantic Segmentation with Thermal Images

Johan Vertens, Jannik Zürn, Wolfram Burgard
2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
The majority of learning-based semantic segmentation methods are optimized for daytime scenarios and favorable lighting conditions. Real-world driving scenarios, however, entail adverse environmental conditions such as nighttime illumination or glare which remain a challenge for existing approaches. In this work, we propose a multimodal semantic segmentation model that can be applied during daytime and nighttime. To this end, besides RGB images, we leverage thermal images, making our network… 

DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation

This paper proposes a novel domain adaptation network (DANNet) for nighttime semantic segmentation without using labeled nighttime image data; it employs adversarial training with a labeled daytime dataset and an unlabeled dataset that contains coarsely aligned day-night image pairs.

Cross-Domain Correlation Distillation for Unsupervised Domain Adaptation in Nighttime Semantic Segmentation

This work proposes a novel domain adaptation framework via cross-domain correlation distillation, called CCDistill, which achieves state-of-the-art performance for nighttime semantic segmentation. It is a one-stage domain adaptation network that avoids affecting inference time.

Unsupervised RGB-to-Thermal Domain Adaptation via Multi-Domain Attention Network

This work presents a new method for unsupervised thermal image classification and semantic segmentation by transferring knowledge from the RGB domain using a multi-domain attention network. The method outperforms the state-of-the-art RGB-to-thermal adaptation method in classification benchmarks and is successfully applied to thermal river scene segmentation using only synthetic RGB images.

NLFNet: Non-Local Fusion Towards Generalized Multimodal Semantic Segmentation across RGB-Depth, Polarization, and Thermal Images

This work proposes the Non-Local Fusion Network (NLFNet), a semantic segmentation network that selectively fuses multimodal input information in an adaptive manner, improving segmentation accuracy and addressing object recognition in various challenging real-world scenes.

An Unsupervised Domain Adaptive Approach for Multimodal 2D Object Detection in Adverse Weather Conditions

This work proposes an unsupervised domain adaptation framework, which adapts a 2D object detector for RGB and lidar sensors to one or more target domains featuring adverse weather conditions and leverages the complementary features of multiple modalities through a multi-scale entropy-weighted domain discriminator.

Zero-Shot Day-Night Domain Adaptation with a Physics Prior

Improved performance for zero-shot day-to-night domain adaptation is demonstrated on both synthetic and natural datasets in various tasks, including classification, segmentation, and place recognition.

Zero-Shot Domain Adaptation with a Physics Prior

Improved performance for zero-shot day-to-night domain adaptation is demonstrated on both synthetic and natural datasets in various tasks, including classification, segmentation, and place recognition.

Semantic Segmentation for Thermal Images: A Comparative Survey

  • Zülfiye Kütük, G. Algan
  • Computer Science
    2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
  • 2022
This work aims to provide a comprehensive survey centered explicitly on semantic segmentation using the infrared spectrum, presenting algorithms in the literature and categorizing them by their input images.

Seeing BDD100K in dark: Single-Stage Night-time Object Detection via Continual Fourier Contrastive Learning

A novel technique for enhancing the object detector via contrastive learning, which tries to group together embeddings of similar images, achieves state-of-the-art performance on the large-scale BDD100K dataset in a uniform setting.

Polarization-driven Semantic Segmentation via Efficient Attention-bridged Fusion

This work presents EAFNet, an Efficient Attention-bridged Fusion Network, to exploit complementary information coming from different optical sensors, and incorporates polarization sensing to obtain supplementary information, considering its optical characteristics for robust representation of diverse materials.



Guided Curriculum Model Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation

This work addresses the problem of semantic segmentation of nighttime images and improves the state of the art by adapting daytime models to nighttime without using nighttime annotations. It also designs a new evaluation framework to address the substantial uncertainty of semantics in nighttime images.

Dark Model Adaptation: Semantic Image Segmentation from Daytime to Nighttime

  • Dengxin Dai, L. Gool
  • Computer Science, Environmental Science
    2018 21st International Conference on Intelligent Transportation Systems (ITSC)
  • 2018
A novel method to progressively adapt semantic models trained on daytime scenes, along with their large-scale annotations, to nighttime scenes via the bridge of twilight time. It alleviates the cost of human annotation for nighttime images by transferring knowledge from standard daytime conditions.

Segmenting Objects in Day and Night: Edge-Conditioned CNN for Thermal Image Semantic Segmentation

This article designs a gated featurewise transform layer in EC-CNN to adaptively incorporate edge prior knowledge and introduces a new benchmark dataset named "Segmenting Objects in Day and night" (SODA) for comprehensive evaluations in thermal image semantic segmentation.

See clearer at night: towards robust nighttime semantic segmentation through day-night image conversion

A framework that uses Generative Adversarial Networks (GANs) to alleviate the accuracy decline when semantic segmentation is taken to adverse conditions. It shows that performance varies with the proportion of synthetic nighttime images in the dataset, where the sweet spot corresponds to the most robust performance across day and night.

Bridging the Day and Night Domain Gap for Semantic Segmentation

Diverse options such as enlarging the dataset to cover these domains in unsupervised training or adapting the images on-the-fly during inference to a comfortable domain such as sunny daylight in a pre-processing step are explored, allowing IV perception systems to work reliably also at night.

RTFNet: RGB-Thermal Fusion Network for Semantic Segmentation of Urban Scenes

This work takes advantage of thermal images and fuses RGB and thermal information in a novel deep neural network that outperforms the state of the art in semantic segmentation for autonomous vehicles.

MFNet: Towards real-time semantic segmentation for autonomous vehicles with multi-spectral scenes

This work addresses the semantic segmentation of street-scene images for autonomous vehicles based on a new RGB-Thermal dataset introduced in the paper, and shows that segmentation accuracy is significantly increased by adding thermal infrared information.

Deep Multispectral Semantic Scene Understanding of Forested Environments Using Multimodal Fusion

This paper introduces a first-of-its-kind multispectral segmentation benchmark that contains 15,000 images and 366 pixel-wise ground truth annotations of unstructured forest environments and identifies new data augmentation strategies that enable training of very deep models using relatively small datasets.

AdapNet: Adaptive semantic segmentation in adverse environmental conditions

This paper proposes a novel semantic segmentation architecture and the convoluted mixture of deep experts (CMoDE) fusion technique, which enables a multi-stream deep neural network to learn features from complementary modalities and spectra, each of which is specialized in a subset of the input space.

Learning to Adapt Structured Output Space for Semantic Segmentation

A multi-level adversarial network is constructed to effectively perform output space domain adaptation at different feature levels and it is shown that the proposed method performs favorably against the state-of-the-art methods in terms of accuracy and visual quality.