MODNet: Real-Time Trimap-Free Portrait Matting via Objective Decomposition

@inproceedings{ke2022modnet,
  title={MODNet: Real-Time Trimap-Free Portrait Matting via Objective Decomposition},
  author={Zhanghan Ke and Jiayu Sun and Kaican Li and Qiong Yan and Rynson W. H. Lau},
  booktitle={AAAI Conference on Artificial Intelligence},
  year={2022}
}

Existing portrait matting methods either require auxiliary inputs that are costly to obtain or involve multiple stages that are computationally expensive, making them less suitable for real-time applications. In this work, we present a light-weight matting objective decomposition network (MODNet) for real-time portrait matting from a single input image. The key idea behind our efficient design is to optimize a series of sub-objectives simultaneously via explicit constraints. In addition…
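The objective decomposition described above can be sketched as a weighted sum of per-sub-objective losses. This is an illustrative sketch, not MODNet's actual implementation: the branch names (semantic, detail, fused alpha), the use of plain L2 for every term, and the weights are all assumptions made for clarity.

```python
# Hypothetical sketch of objective decomposition for matting: the overall
# target is split into sub-objectives (coarse semantics, boundary detail,
# fused alpha matte), each supervised by its own loss and optimized
# simultaneously as a weighted sum. Names and weights are illustrative.
import numpy as np

def l2(pred, gt):
    """Mean squared error between a prediction and its ground truth."""
    return float(np.mean((pred - gt) ** 2))

def decomposed_matting_loss(sem_pred, sem_gt, det_pred, det_gt,
                            alpha_pred, alpha_gt,
                            w_sem=1.0, w_det=10.0, w_alpha=1.0):
    """Combine the sub-objective losses into one scalar training objective."""
    loss_sem = l2(sem_pred, sem_gt)        # coarse semantic (portrait) branch
    loss_det = l2(det_pred, det_gt)        # boundary detail branch
    loss_alpha = l2(alpha_pred, alpha_gt)  # fused alpha matte
    return w_sem * loss_sem + w_det * loss_det + w_alpha * loss_alpha
```

Because each term constrains a different sub-problem, a single lightweight network can be trained end-to-end on all of them at once, which is what makes the single-stage, auxiliary-input-free design feasible.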

Learning to Relight Portrait Images via a Virtual Light Stage and Synthetic-to-Real Adaptation

This work proposes a new approach that can perform on par with the state-of-the-art (SOTA) relighting methods without requiring a light stage, and develops a novel synthetic-to-real approach to bring photorealism to the relighting network output.

One-Trimap Video Matting

This paper proposes One-Trimap Video Matting network (OTVM) that performs video matting robustly using only one user-annotated trimap, and greatly improves the temporal stability of trimap propagation compared to the previous decoupled methods.

Foreground–Background Decoupling Matting

A novel Foreground–Background Decoupling Matting (FBDM) network is proposed, motivated by new subtasks, and extensive experiments demonstrate that FBDM generates the best results compared with state-of-the-art trimap-free methods.

Referring Image Matting

This paper establishes the first large-scale challenging dataset RefMatte by designing a comprehensive image composition and expression generation engine to produce synthetic images on top of current public high-quality matting foregrounds with flexible logics and re-labelled diverse attributes.

Deep Gradient Learning for Efficient Camouflaged Object Detection

The proposed DGNet performs well in the polyp segmentation, defect detection, and transparent object segmentation tasks, and achieves comparable results to the cutting-edge model JCSOD-CVPR21 with only 6.82% parameters.

Robust Human Matting via Semantic Guidance

This work develops a fast yet accurate human matting framework, named Semantic Guided Human Matting (SGHM), which builds on a semantic human segmentation network and introduces a light-weight matting module with only marginal computational cost.

KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints

This work investigates common issues with existing spatial encodings and proposes a simple yet highly effective approach for modeling high-fidelity volumetric avatars from sparse views: encoding relative spatial 3D information via sparse 3D keypoints, which is robust to viewpoint sparsity and cross-dataset domain gaps.

Harmonizer: Learning to Perform White-Box Image and Video Harmonization

This work frames image harmonization as an image-level regression problem to learn the arguments of the filters that humans use for the task, and presents a Harmonizer framework, which surpasses existing methods notably, especially with high-resolution inputs.

Infusing Definiteness into Randomness: Rethinking Composition Styles for Deep Image Matting

A novel composition style is introduced that binds the source and combined foregrounds in a definite triplet and shows that different orders of foreground combination lead to different foreground patterns, which further inspires a quadruplet-based composition style.

Fusing Global and Local Features for Generalized AI-Synthesized Image Detection

A two-branch model is designed to combine global spatial information from the whole image and local informative features from multiple patches selected by a novel patch selection module to improve the generalization ability of AI-synthesized image detection.

Disentangled Image Matting

This paper proposes AdaMatting, a new end-to-end matting framework that disentangles this problem into two sub-tasks: trimap adaptation and alpha estimation, which achieves the state-of-the-art performance on Adobe Composition-1k dataset both qualitatively and quantitatively.

Active Matting

The intrinsic relationship between user input and the matting algorithm is explored to address where and when the user should provide input, and the most informative sequence of regions for user input is discovered in order to produce a good alpha matte with minimal labeling effort.

Fast Deep Matting for Portrait Animation on Mobile Phone

A real-time automatic deep matting approach for mobile devices, based on densely connected blocks and dilated convolution, is designed to predict a coarse binary mask for portrait images; on top of it, an automatic portrait animation system is built on mobile devices that requires no interaction and realizes real-time matting at 15 fps.

Attention-Guided Hierarchical Structure Aggregation for Image Matting

An end-to-end Hierarchical Attention Matting Network (HAttMatting) that can predict better-structured alpha mattes from single RGB images without additional input, together with a hybrid loss function fusing Structural SIMilarity, Mean Squared Error, and adversarial loss to guide the network to further improve the overall foreground structure.

Shared Sampling for Real-Time Alpha Matting

The first real-time alpha matting technique for natural images and videos is presented, based on the observation that, for small neighborhoods, pixels tend to share similar attributes; it achieves speedups of up to two orders of magnitude over previous techniques while producing high-quality alpha mattes.

Semantic Human Matting

Semantic Human Matting (SHM) is the first algorithm that learns to jointly fit both semantic information and high quality details with deep networks and achieves comparable results with state-of-the-art interactive matting methods.

Natural Image Matting via Guided Contextual Attention

This work develops a novel end-to-end approach for natural image matting with a guided contextual attention module, which is specifically designed for image matting and can mimic the information flow of affinity-based methods while simultaneously utilizing rich features learned by deep neural networks.

Learning-Based Sampling for Natural Image Matting

Estimating the layer colors with deep neural networks prior to opacity estimation is a better match for the capabilities of neural networks, and the availability of these colors substantially increases the performance of opacity estimation due to the reduced number of unknowns in the compositing equation.
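The compositing equation referenced above is the standard matting model, I = αF + (1 − α)B: once the foreground color F and background color B are known, the opacity α is the only remaining unknown per pixel and admits a closed-form least-squares estimate. The sketch below is a simplified illustration of that reduction; the projection-based solver and the `eps` regularizer are assumptions, not the paper's method.

```python
# The matting compositing equation: each pixel I is a convex combination of
# foreground F and background B weighted by opacity alpha:
#     I = alpha * F + (1 - alpha) * B
# With F and B predicted first, alpha is recovered by projecting (I - B)
# onto (F - B) in color space, a simple least-squares estimate.
import numpy as np

def solve_alpha(I, F, B, eps=1e-8):
    """Estimate per-pixel alpha given the image and layer colors.

    All arrays are (..., 3) RGB; returns an array of shape (...).
    """
    d = F - B
    num = np.sum((I - B) * d, axis=-1)       # <I - B, F - B>
    den = np.sum(d * d, axis=-1) + eps       # ||F - B||^2, regularized
    return np.clip(num / den, 0.0, 1.0)      # opacity is constrained to [0, 1]
```

This makes the abstract's point concrete: with seven unknowns per pixel (F, B, α) the problem is badly under-constrained, but with F and B supplied by a network, only the scalar α remains.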

Deep Automatic Portrait Matting

An automatic image matting method for portrait images that does not need user interaction is proposed and achieves comparable results with state-of-the-art methods that require specified foreground and background regions or pixels.

KNN Matting

The matting technique, aptly called KNN matting, capitalizes on the nonlocal principle by using K nearest neighbors (KNN) in matching nonlocal neighborhoods, and contributes a simple and fast algorithm giving competitive results with sparse user markups.
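The nonlocal principle described above can be illustrated with a minimal K-nearest-neighbor search in a feature space (for KNN matting this combines color and spatial coordinates). This brute-force sketch only shows how the sparse neighbor structure is formed; the feature weighting and the Laplacian solver that propagates user markup are omitted.

```python
# Illustrative sketch of the nonlocal principle behind KNN matting: each
# pixel is linked to its K nearest neighbors in feature space, producing a
# sparse affinity structure used to propagate sparse user markup. The
# brute-force distance computation here is for clarity, not efficiency.
import numpy as np

def knn_affinity(features, k=3):
    """Return (n, k) indices of each sample's k nearest neighbors.

    `features` is an (n, d) array, e.g. per-pixel [R, G, B, x, y] vectors.
    """
    d2 = np.sum((features[:, None, :] - features[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)          # exclude self-matches
    return np.argsort(d2, axis=1)[:, :k]  # k smallest distances per row
```

Because each pixel connects only to its K neighbors rather than to a dense neighborhood, the resulting linear system stays sparse, which is what makes the algorithm simple and fast with only sparse user markup.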