• Corpus ID: 239768306

Perceptual Consistency in Video Segmentation

@article{Zhang2021PerceptualCI,
  title={Perceptual Consistency in Video Segmentation},
  author={Yizhe Zhang and Shubhankar Borse and Hong Cai and Ying Wang and Ning Bi and Xiaoyun Jiang and Fatih Murat Porikli},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.12385}
}
In this paper, we present a novel perceptual consistency perspective on video semantic segmentation, which can capture both temporal consistency and pixel-wise correctness. Given two nearby video frames, perceptual consistency measures how much the segmentation decisions agree with the pixel correspondences obtained via matching general perceptual features. More specifically, for each pixel in one frame, we find the most perceptually correlated pixel in the other frame. Our intuition is that… 
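The matching idea in the abstract can be illustrated with a small sketch. This is not the paper's exact metric (the abstract above is truncated). It is a simplified illustration assuming dense per-pixel features are already available for both frames, and that agreement is scored as the fraction of pixels whose best match carries the same label. The function name `perceptual_agreement` and its arguments are hypothetical.

```python
import numpy as np

def perceptual_agreement(feat_a, feat_b, seg_a, seg_b):
    """Fraction of pixels in frame A whose most similar pixel in frame B
    (by cosine similarity of features) has the same segmentation label.

    feat_a, feat_b: (H, W, C) per-pixel feature maps (assumed given).
    seg_a, seg_b:   (H, W) integer label maps.
    Simplified illustration only, not the paper's exact formulation.
    """
    c = feat_a.shape[-1]
    fa = feat_a.reshape(-1, c).astype(np.float64)
    fb = feat_b.reshape(-1, c).astype(np.float64)
    # L2-normalize so the dot product equals cosine similarity.
    fa /= np.linalg.norm(fa, axis=1, keepdims=True) + 1e-12
    fb /= np.linalg.norm(fb, axis=1, keepdims=True) + 1e-12
    sim = fa @ fb.T                      # (H*W, H*W) similarity matrix
    match = sim.argmax(axis=1)           # best match in B for each pixel in A
    agree = seg_a.reshape(-1) == seg_b.reshape(-1)[match]
    return float(agree.mean())
```

The full pairwise similarity matrix grows quadratically with the number of pixels, so a sketch like this is only practical on small or downsampled feature maps.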

References

Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow
TLDR
This paper proposes a novel framework for joint video semantic segmentation and optical flow estimation that is able to utilize both labeled and unlabeled frames in the video through joint training, while no additional calculation is required in inference.
A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation
TLDR
This work presents a new benchmark dataset and evaluation methodology for the area of video object segmentation, named DAVIS (Densely Annotated VIdeo Segmentation), and provides a comprehensive analysis of several state-of-the-art segmentation approaches using three complementary metrics.
Joint Optical Flow and Temporally Consistent Semantic Segmentation
TLDR
This paper proposes a method for jointly estimating optical flow and temporally consistent semantic segmentation, closely connecting the two problem domains so that each leverages the other.
Unsupervised Temporal Consistency Metric for Video Segmentation in Highly-Automated Driving
TLDR
This paper introduces a metric that does not require segmentation labels for measuring the stability of segmentation network predictions over a series of images, and proposes using the metric either as an online observer for identifying possibly unstable segmentation predictions or as an offline method to evaluate or improve semantic segmentation networks.
Exploiting Temporality for Semi-Supervised Video Segmentation
TLDR
This work tackles the issue of label scarcity by using consecutive frames of a video, where only one frame is annotated, and proposes a deep, end-to-end trainable model which leverages temporal information in order to make use of easy to acquire unlabeled data.
Video Object Segmentation Using Space-Time Memory Networks
TLDR
This work proposes a novel solution for semi-supervised video object segmentation by leveraging memory networks and learning to read relevant information from all available sources to better handle challenges such as appearance changes and occlusions.
Architecture Search of Dynamic Cells for Semantic Video Segmentation
TLDR
This work proposes a neural architecture search solution, where the choice of operations together with their sequential arrangement is predicted by a separate neural network, and shows that such generalisation leads to stable and accurate results across common benchmarks, such as the Cityscapes and CamVid datasets.
Video Object Segmentation and Tracking: A Survey
TLDR
This survey provides a comprehensive review of state-of-the-art tracking methods, classifies these methods into different categories, and identifies new trends.
Feature Space Optimization for Semantic Video Segmentation
TLDR
An approach to long-range spatio-temporal regularization in semantic video segmentation by optimizing the mapping of pixels to a Euclidean feature space so as to minimize distances between corresponding points.
Low-Latency Video Semantic Segmentation
TLDR
A framework for video semantic segmentation is developed, which incorporates two novel components: a feature propagation module that adaptively fuses features over time via spatially variant convolution, thus reducing the cost of per-frame computation, and an adaptive scheduler that dynamically allocates computation based on accuracy prediction.