SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation

@inproceedings{Cao2020SipMaskSI,
  title={SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation},
  author={Jiale Cao and Rao Muhammad Anwer and Hisham Cholakkal and F. Khan and Yanwei Pang and L. Shao},
  booktitle={ECCV},
  year={2020}
}
Single-stage instance segmentation approaches have recently gained popularity due to their speed and simplicity, but are still lagging behind in accuracy, compared to two-stage methods. We propose a fast single-stage instance segmentation method, called SipMask, that preserves instance-specific spatial information by separating mask prediction of an instance to different sub-regions of a detected bounding-box. Our main contribution is a novel light-weight spatial preservation (SP) module that…
Video Instance Segmentation with a Propose-Reduce Paradigm
TLDR
This work proposes a new paradigm, Propose-Reduce, to generate complete sequences for input videos in a single step, and builds a sequence propagation head on the existing image-level instance segmentation network for long-term propagation.
CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation
TLDR
This work proposes a novel comprehensive feature aggregation approach (CompFeat) to refine features at both frame-level and object-level with temporal and spatial context information to eliminate ambiguities introduced by only using single-frame features.
MSN: Efficient Online Mask Selection Network for Video Instance Segmentation
TLDR
This work presents a novel solution for Video Instance Segmentation (VIS), that is, automatically generating instance-level segmentation masks along with object classes and tracking them in a video, using the Mask Selection Network (MSN).
QueryInst: Parallelly Supervised Mask Query for Instance Segmentation
Recently, query based object detection frameworks achieve comparable performance with previous state-of-the-art object detectors. However, how to fully leverage such frameworks to perform instance…
BoundarySqueeze: Image Segmentation as Boundary Squeezing
TLDR
The proposed Boundary Squeeze module is a novel and efficient module that squeezes the object boundary from both inner and outer directions, which leads to precise mask representation, and outperforms the previous state-of-the-art PointRend in both accuracy and speed under the same setting.
Instance Sequence Queries for Video Instance Segmentation with Transformers
TLDR
This work uses a set of queries, called instance sequence queries (ISQs), to drive the transformer decoder and produce results at each frame, and extends the bipartite matching loss to two frames, so there is no need for complex data association.
Improving Video Instance Segmentation via Temporal Pyramid Routing
  • Xiangtai Li, Hao He, +4 authors Yunhai Tong
  • Computer Science
  • ArXiv
  • 2021
TLDR
A Temporal Pyramid Routing strategy to conditionally align and conduct pixel-level aggregation from a feature pyramid pair of two adjacent frames is proposed, which is a plug-and-play module and can be easily applied to existing instance segmentation methods.
TF-Blender: Temporal Feature Blender for Video Object Detection
  • Yiming Cui, Liqi Yan, Zhiwen Cao, Dongfang Liu
  • Computer Science
  • ArXiv
  • 2021
TLDR
A novel solution named TF-Blender, which includes three modules: temporal relation models the relations between the current frame and its neighboring frames to preserve spatial information, feature adjustment enriches the representation of every neighboring feature map, and feature blender combines outputs from the first two modules and produces stronger features for the later detection tasks.
Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation
TLDR
Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information for online multiple object tracking and segmentation, is proposed, which first distills a space-time memory into a set of prototypes and then employs cross-attention to retrieve rich information from the past frames.
Instances as Queries
TLDR
This paper presents QueryInst (Instances as Queries), a query based instance segmentation method driven by parallel supervision on dynamic mask heads that achieves the best performance among all online VIS approaches and strikes a decent speed-accuracy trade-off.

References

SHOWING 1-10 OF 54 REFERENCES
PolarMask: Single Shot Instance Segmentation With Polar Representation
  • Enze Xie, Pei Sun, +5 authors Ping Luo
  • Computer Science
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2020
In this paper, we introduce an anchor-box free and single shot instance segmentation method, which is conceptually simple, fully convolutional and can be used by easily embedding it into most…
BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation
TLDR
The proposed BlendMask can effectively predict dense per-pixel position-sensitive instance features with very few channels, and learn attention maps for each instance with merely one convolution layer, thus being fast in inference.
SSAP: Single-Shot Instance Segmentation With Affinity Pyramid
TLDR
This work proposes a single-shot proposal-free instance segmentation method that requires only a single pass for prediction, based on a pixel-pair affinity pyramid, which computes the probability that two pixels belong to the same instance in a hierarchical manner.
InstaBoost: Boosting Instance Segmentation via Probability Map Guided Copy-Pasting
TLDR
This paper presents a simple, efficient and effective method to augment the training set using the existing instance mask annotations, and proposes a location probability map based approach to explore the feasible locations where objects can be placed based on local appearance similarity.
D2Det: Towards High Quality Object Detection and Instance Segmentation
TLDR
A novel two-stage detection method, D2Det, that collectively addresses both precise localization and accurate classification is proposed, and a discriminative RoI pooling scheme that samples from various sub-regions of a proposal and performs adaptive weighting to obtain discriminative features is introduced.
Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth
TLDR
This work proposes a new clustering loss function for proposal-free instance segmentation that pulls the spatial embeddings of pixels belonging to the same instance together and jointly learns an instance-specific clustering bandwidth, maximizing the intersection-over-union of the resulting instance mask.
RDSNet: A New Deep Architecture for Reciprocal Object Detection and Instance Segmentation
TLDR
RDSNet is presented, a novel deep architecture for reciprocal object detection and instance segmentation that combines a correlation module and a cropping module to yield instance masks, as well as a mask based boundary refinement module for more accurate bounding boxes.
FCOS: Fully Convolutional One-Stage Object Detection
TLDR
For the first time, a much simpler and flexible detection framework achieving improved detection accuracy is demonstrated, and it is hoped that the proposed FCOS framework can serve as a simple and strong alternative for many other instance-level tasks.
TensorMask: A Foundation for Dense Object Segmentation
TLDR
It is demonstrated that the tensor view leads to large gains over baselines that ignore this structure, and leads to results comparable to Mask R-CNN, suggesting that TensorMask can serve as a foundation for novel advances in dense mask prediction and a more complete understanding of the task.
MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features
TLDR
This work presents a model, called MaskLab, which produces three outputs: box detection, semantic segmentation, and direction prediction. It is evaluated on the COCO instance segmentation benchmark and shows comparable performance with other state-of-the-art models.