Mask R-CNN

@inproceedings{He2017MaskR,
  title={Mask R-CNN},
  author={Kaiming He and Georgia Gkioxari and Piotr Doll{\'a}r and Ross B. Girshick},
  booktitle={2017 IEEE International Conference on Computer Vision (ICCV)},
  year={2017},
  pages={2980--2988}
}
We present a conceptually simple, flexible, and general framework for object instance segmentation. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top…
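One way to picture the parallel mask branch described in the abstract is as an extra term in a multi-task training loss. The sketch below is illustrative only: it assumes the per-RoI loss is the sum of a classification term, a box term, and a mask term, with the mask term computed as an average per-pixel binary cross-entropy on the mask of the ground-truth class. That formulation follows the Mask R-CNN paper rather than anything stated on this page, and the function names (`mask_loss`, `total_loss`) and the nested-list mask layout are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def mask_loss(mask_logits, gt_mask, gt_class):
    """Average per-pixel binary cross-entropy for one RoI.

    mask_logits: list of K masks (one per class), each an m x m list of logits.
    gt_mask:     m x m list of 0/1 targets.
    gt_class:    index of the ground-truth class; only that class's mask
                 contributes to the loss (an assumption taken from the paper).
    """
    logits = mask_logits[gt_class]
    total, count = 0.0, 0
    for logit_row, target_row in zip(logits, gt_mask):
        for z, t in zip(logit_row, target_row):
            # Clamp the probability to keep log() finite for extreme logits.
            p = min(max(sigmoid(z), 1e-12), 1.0 - 1e-12)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def total_loss(cls_loss, box_loss, mask_logits, gt_mask, gt_class):
    """Sum of the classification, box, and mask terms for one RoI."""
    return cls_loss + box_loss + mask_loss(mask_logits, gt_mask, gt_class)


if __name__ == "__main__":
    # Two classes, 2 x 2 masks; the ground-truth class is 1.
    logits = [[[5.0, -5.0], [5.0, -5.0]], [[0.0, 0.0], [0.0, 0.0]]]
    target = [[1, 0], [0, 1]]
    print(total_loss(0.5, 0.25, logits, target, gt_class=1))
```

Because only the ground-truth class's mask enters the mask term, the logits for other classes do not affect the loss for that RoI.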
Volume R-CNN: Unified Framework for CT Object Detection and Instance Segmentation
TLDR
Volume R-CNN is an end-to-end method that performs region proposal, classification, and instance segmentation in a single model, which dramatically reduces computational overhead and parameter count.
Mask R-CNN
TLDR
This work presents a conceptually simple, flexible, and general framework for object instance segmentation that outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners.
Boundary-preserving Mask R-CNN
TLDR
A conceptually simple yet effective Boundary-preserving Mask R-CNN (BMask R-CNN) that leverages object boundary information to improve mask localization accuracy in instance segmentation.
Cascade Mask Generation Framework for Fast Small Object Detection
TLDR
The proposed cascade mask generation framework takes in multi-scale images as input and processes them in ascending order of the scale, producing object proposals as well as a region-of-interest (RoI) mask for the next stage.
TensorMask: A Foundation for Dense Object Segmentation
TLDR
It is demonstrated that the tensor view leads to large gains over baselines that ignore this structure, and leads to results comparable to Mask R-CNN, suggesting that TensorMask can serve as a foundation for novel advances in dense mask prediction and a more complete understanding of the task.
Mask SSD: An Effective Single-Stage Approach to Object Instance Segmentation
TLDR
Experimental results verify that the proposed Mask SSD method achieves precision comparable to state-of-the-art approaches with less speed overhead.
Shape-aware Feature Extraction for Instance Segmentation
TLDR
A new region-of-interest (RoI) feature extraction strategy, named Shape-aware RoIAlign, which focuses feature extraction within a region aligned well with the shape of the instance-of-interest rather than a rectangular RoI.
Multi-Task Learning via Scale Aware Feature Pyramid Networks and Effective Joint Head
  • Feng Ni, Yuehan Yao
  • Computer Science
    2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)
  • 2019
TLDR
A unified head module named EJ-Head (Effective Joint Head) is proposed to combine two branches into one head, not only realizing the interaction between two tasks, but also enhancing the effectiveness of multi-task learning.
Instance Segmentation with Point Supervision
TLDR
The method obtains competitive results compared to fully-supervised methods in certain scenarios; outperforms fully- and weakly-supervised methods with a fixed annotation budget; and is a first strong baseline for instance segmentation with point-level supervision.
Frustratingly Easy Trade-off Optimization Between Single-Stage and Two-Stage Deep Object Detectors
TLDR
Four simple and straightforward approaches to achieve an optimal trade-off between accuracy and speed in object detection are proposed and evaluated, each based on an image difficulty predictor, and each employs a faster single-stage detector to determine the approximate number of objects and their sizes.

References

SHOWING 1-10 OF 49 REFERENCES
Simultaneous Detection and Segmentation
TLDR
This work builds on recent work that uses convolutional neural networks to classify category-independent region proposals (R-CNN), introducing a novel architecture tailored for SDS, and uses category-specific, top-down figure-ground predictions to refine the bottom-up proposals.
Shape-aware Instance Segmentation
TLDR
This paper introduces a novel object segment representation based on the distance transform of the object masks, and designs an object mask network (OMN) with a new residual-deconvolution architecture that infers such a representation and decodes it into the final binary object mask.
R-FCN: Object Detection via Region-based Fully Convolutional Networks
TLDR
This work presents region-based, fully convolutional networks for accurate and efficient object detection, and proposes position-sensitive score maps to address a dilemma between translation-invariance in image classification and translation-variance in object detection.
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
TLDR
This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
TLDR
This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Feature Pyramid Networks for Object Detection
TLDR
This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.
Learning to Segment Object Candidates
TLDR
A new way to generate object proposals is proposed, introducing an approach based on a discriminative convolutional network that obtains substantially higher object recall using fewer proposals and is able to generalize to categories not seen during training.
Learning to Refine Object Segments
TLDR
This work proposes to augment feedforward nets for object segmentation with a novel top-down refinement approach that is capable of efficiently generating high-fidelity object masks and is 50% faster than the original DeepMask network.
Convolutional feature masking for joint object and stuff segmentation
  • Jifeng Dai, Kaiming He, Jian Sun
  • Computer Science
    2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2015
TLDR
This paper proposes a joint method to handle objects and “stuff” (e.g., grass, sky, water) in the same framework and presents state-of-the-art results on the PASCAL VOC and new PASCAL-CONTEXT benchmarks.
Instance-Sensitive Fully Convolutional Networks
TLDR
This paper develops FCNs that are capable of proposing instance-level segment candidates that do not have any high-dimensional layer related to the mask resolution, but instead exploits image local coherence for estimating instances.