HeatNet: Bridging the Day-Night Domain Gap in Semantic Segmentation with Thermal Images

@article{Vertens2020HeatNetBT,
  title={HeatNet: Bridging the Day-Night Domain Gap in Semantic Segmentation with Thermal Images},
  author={Johan Vertens and Jannik Z{\"u}rn and Wolfram Burgard},
  journal={2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2020},
  pages={8461--8468}
}
The majority of learning-based semantic segmentation methods are optimized for daytime scenarios and favorable lighting conditions. Real-world driving scenarios, however, entail adverse environmental conditions such as nighttime illumination or glare which remain a challenge for existing approaches. In this work, we propose a multimodal semantic segmentation model that can be applied during daytime and nighttime. To this end, besides RGB images, we leverage thermal images, making our network… 

DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation

TLDR
This paper proposes a novel domain adaptation network (DANNet) for nighttime semantic segmentation without using labeled nighttime image data; it employs adversarial training with a labeled daytime dataset and an unlabeled dataset that contains coarsely aligned day-night image pairs.

Cross-Domain Correlation Distillation for Unsupervised Domain Adaptation in Nighttime Semantic Segmentation

TLDR
This work proposes a novel domain adaptation framework via cross-domain correlation distillation, called CCDistill, which achieves state-of-the-art performance for nighttime semantic segmentation. It is a one-stage domain adaptation network, so it avoids affecting inference time.

An Unsupervised Domain Adaptive Approach for Multimodal 2D Object Detection in Adverse Weather Conditions

TLDR
This work proposes an unsupervised domain adaptation framework, which adapts a 2D object detector for RGB and lidar sensors to one or more target domains featuring adverse weather conditions and leverages the complementary features of multiple modalities through a multi-scale entropy-weighted domain discriminator.

Zero-Shot Day-Night Domain Adaptation with a Physics Prior

TLDR
Improved performance for zero-shot day to night domain adaptation is demonstrated on both synthetic as well as natural datasets in various tasks, including classification, segmentation and place recognition.

Zero-Shot Domain Adaptation with a Physics Prior

TLDR
Improved performance for zero-shot day to night domain adaptation is demonstrated on both synthetic as well as natural datasets in various tasks, including classification, segmentation and place recognition.

Semantic Segmentation for Thermal Images: A Comparative Survey

  • Zülfiye Kütük, G. Algan
  • Computer Science
    2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
  • 2022
TLDR
This work aims to provide a comprehensive survey centered explicitly on semantic segmentation in the infrared spectrum, presenting algorithms from the literature and categorizing them by their input images.

Seeing BDD100K in dark: Single-Stage Night-time Object Detection via Continual Fourier Contrastive Learning

TLDR
A novel technique that enhances the object detector via contrastive learning, which tries to group together embeddings of similar images, and achieves state-of-the-art performance on the large-scale BDD100K dataset in a uniform setting.

Polarization-driven Semantic Segmentation via Efficient Attention-bridged Fusion

TLDR
This work presents EAFNet, an Efficient Attention-bridged Fusion Network, to exploit complementary information coming from different optical sensors, and incorporates polarization sensing to obtain supplementary information, considering its optical characteristics for robust representation of diverse materials.

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

TLDR
This work presents the novel self-supervised MM-DistillNet framework consisting of multiple teachers that leverage diverse modalities including RGB, depth and thermal images, to simultaneously exploit complementary cues and distill knowledge into a single audio student network, and proposes the new MTA loss function that facilitates the distillation of information from multimodal teachers in a self-supervised manner.

References

Guided Curriculum Model Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation

TLDR
This work addresses the problem of semantic segmentation of nighttime images and improves the state of the art by adapting daytime models to nighttime without using nighttime annotations, and designs a new evaluation framework to address the substantial uncertainty of semantics in nighttime images.

Dark Model Adaptation: Semantic Image Segmentation from Daytime to Nighttime

  • Dengxin Dai, L. Gool
  • Computer Science, Environmental Science
    2018 21st International Conference on Intelligent Transportation Systems (ITSC)
  • 2018
TLDR
A novel method to progressively adapt the semantic models trained on daytime scenes, along with large-scale annotations therein, to nighttime scenes via the bridge of twilight time, alleviating the cost of human annotation for nighttime images by transferring knowledge from standard daytime conditions.

Segmenting Objects in Day and Night: Edge-Conditioned CNN for Thermal Image Semantic Segmentation

TLDR
This article designs a gated feature-wise transform layer in EC-CNN to adaptively incorporate edge prior knowledge and introduces a new benchmark dataset named “Segmenting Objects in Day and night” (SODA) for comprehensive evaluations in thermal image semantic segmentation.

See clearer at night: towards robust nighttime semantic segmentation through day-night image conversion

TLDR
A framework that uses Generative Adversarial Networks (GANs) to alleviate the accuracy decline when semantic segmentation is taken to adverse conditions. It shows that performance varies with the proportion of synthetic nighttime images in the dataset, where the sweet spot corresponds to the most robust performance across day and night.

Bridging the Day and Night Domain Gap for Semantic Segmentation

TLDR
Diverse options such as enlarging the dataset to cover these domains in unsupervised training or adapting the images on-the-fly during inference to a comfortable domain such as sunny daylight in a pre-processing step are explored, allowing IV perception systems to work reliably also at night.

RTFNet: RGB-Thermal Fusion Network for Semantic Segmentation of Urban Scenes

TLDR
This work takes advantage of thermal images and fuses both the RGB and thermal information in a novel deep neural network that outperforms the state of the art in semantic segmentation for autonomous vehicles.

MFNet: Towards real-time semantic segmentation for autonomous vehicles with multi-spectral scenes

TLDR
This work addresses the semantic segmentation of images of street scenes for autonomous vehicles based on a new RGB-Thermal dataset, which is introduced in this paper, and shows that the segmentation accuracy is significantly increased by adding thermal infrared information.

Deep Multispectral Semantic Scene Understanding of Forested Environments Using Multimodal Fusion

TLDR
This paper introduces a first-of-its-kind multispectral segmentation benchmark that contains 15,000 images and 366 pixel-wise ground truth annotations of unstructured forest environments and identifies new data augmentation strategies that enable training of very deep models using relatively small datasets.

Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes

TLDR
This work proposes a curriculum-style learning approach to minimize the domain gap in semantic segmentation, which significantly outperforms the baselines as well as the only known existing approach to the same problem.

AdapNet: Adaptive semantic segmentation in adverse environmental conditions

TLDR
This paper proposes a novel semantic segmentation architecture and the convoluted mixture of deep experts (CMoDE) fusion technique that enables a multi-stream deep neural network to learn features from complementary modalities and spectra, each of which is specialized in a subset of the input space.