Corpus ID: 196831964

Real-time Hair Segmentation and Recoloring on Mobile GPUs

Authors: Andrei Tkachenka, Gregory Karpiak, Andrey Vakunov, Yury Kartynnik, Artsiom Ablavatski, Valentin Bazarevsky, Siargey Pisarchyk
We present a novel approach for neural network-based hair segmentation from a single camera input, designed specifically for real-time mobile applications. Our relatively small neural network produces a high-quality hair segmentation mask that is well suited for AR effects, e.g. virtual hair recoloring. The proposed model achieves real-time inference speed on mobile GPUs (30–100+ FPS, depending on the device) with high accuracy. We also propose a very realistic hair recoloring scheme. Our method…
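The abstract does not spell out the recoloring scheme. As a rough illustration only (not the paper's actual method), recoloring on top of a soft segmentation mask can be sketched as a per-pixel alpha blend, where the mask probability weights the target color; the function name and the `strength` parameter are illustrative assumptions:

```python
import numpy as np

def recolor_hair(frame, mask, target_rgb, strength=0.8):
    """Blend a target color into the hair region given a soft mask.

    frame: HxWx3 float image in [0, 1]
    mask:  HxW float hair probability in [0, 1]
    target_rgb: length-3 target hair color in [0, 1]
    """
    alpha = (mask * strength)[..., None]  # soft per-pixel blend weight
    target = np.asarray(target_rgb, dtype=frame.dtype)
    return (1.0 - alpha) * frame + alpha * target

# Toy 2x2 gray frame: top row is "hair" (mask = 1), bottom row is background (mask = 0).
frame = np.full((2, 2, 3), 0.5)
mask = np.array([[1.0, 1.0], [0.0, 0.0]])
out = recolor_hair(frame, mask, target_rgb=(1.0, 0.0, 0.0), strength=1.0)
```

With full strength, masked pixels take the target color exactly while unmasked pixels are untouched; a realistic scheme would additionally preserve the original shading, which this sketch deliberately omits.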

Real-Time Hair Segmentation Using Mobile-Unet

A real-time hair segmentation method based on a fully convolutional network with an encoder–decoder structure that can extract hair information, which has a significant impact on human–robot interaction.

An optimized mask-guided mobile pedestrian detection network with millisecond scale

A super-lightweight video-based network whose key contribution is that, for each frame, the detection boxes of the previous frame and the RGB channels of the current frame are jointly used as the network input, which improves detection accuracy effectively with little increase in network complexity.

Framework to Computationally Analyze Kathakali Videos

A framework to analyze the actors' facial expressions is presented, along with novel visualization techniques for the same and a novel application of style transfer from Kathakali video onto a cartoonized face.

Learning Illumination from Diverse Portraits

This work presents a learning-based technique for estimating high dynamic range, omnidirectional illumination from a single low dynamic range portrait image captured under arbitrary indoor or outdoor lighting conditions, and shows that this technique outperforms the state-of-the-art technique for portrait-based lighting estimation.

Real‐time Virtual‐Try‐On from a Single Example Image through Deep Inverse Graphics and Learned Differentiable Renderers

This paper proposes a novel framework based on deep learning to build a real‐time inverse graphics encoder that learns to map a single example image into the parameter space of a given augmented reality rendering engine, and introduces a trainable imitator module.

FireNet: Real-time Segmentation of Fire Perimeter from Aerial Video

This paper shares an approach to real-time segmentation of fire perimeters from aerial full-motion infrared video, from a humanitarian aid and disaster response perspective, and explains the importance of the problem, how it is currently addressed, and how the machine learning approach improves on it.

Latents2Segments: Disentangling the Latent Space of Generative Models for Semantic Segmentation of Face Images

The endeavour in this work is to do away with the priors and complex pre-processing operations required by SOTA multi-class face segmentation models by reframing segmentation as a downstream task after infusing disentanglement with respect to facial semantic regions of interest (ROIs) into the latent space of a generative autoencoder model.

Correcting Face Distortion in Wide-Angle Videos

A video warping algorithm that applies stereographic projection locally to the facial regions via spatial-temporal energy minimization and minimizes background deformation with a line-preservation term that keeps the straight edges in the background.

FaceAtlasAR: Atlas of Facial Acupuncture Points in Augmented Reality

This project proposed a system to localize and visualize facial acupoints for individuals in an augmented reality (AR) context that combines a face alignment model and a hair segmentation model to provide dense reference points for acupoint localization in real time (60 FPS).

E-faceatlasAR: extend atlas of facial acupuncture points with auricular maps in augmented reality for self-acupressure

E-FaceAtlasAR is presented, which extends FaceAtlasAR to additionally show auricular zone maps, and adopts MediaPipe, a cross-platform machine learning framework, to build a real-time pipeline that runs on desktop and Android phones.


References

U-Net: Convolutional Networks for Biomedical Image Segmentation

It is shown that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
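The hallmark of U-Net is the skip connection: each upsampled decoder feature map is concatenated with the matching encoder map from before pooling. A minimal numpy sketch of that wiring, with toy shapes and pooling/upsampling standing in for the learned convolution layers:

```python
import numpy as np

def downsample(x):
    """2x2 max pooling over an (H, W, C) feature map."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def upsample(x):
    """Nearest-neighbour 2x upsampling."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def unet_skip(x, encoder_features):
    """Concatenate upsampled decoder features with the matching encoder map."""
    return np.concatenate([upsample(x), encoder_features], axis=-1)

enc = np.random.rand(8, 8, 16)        # encoder feature map before pooling
bottleneck = downsample(enc)          # (4, 4, 16)
decoded = unet_skip(bottleneck, enc)  # (8, 8, 32): the skip doubles the channels
```

The skip path is what lets the decoder recover fine spatial detail that pooling destroyed, which is why the architecture works well for segmentation masks.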

Densely Connected Convolutional Networks

The Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion, and has several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters.
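Dense connectivity means layer ℓ receives the concatenation of all preceding feature maps, so the channel count grows linearly with the growth rate k. A toy numpy sketch in which a random 1x1 channel projection stands in for each convolutional layer (an assumption made for brevity):

```python
import numpy as np

def layer(x, growth_rate, rng):
    """Stand-in for a conv layer: maps all input channels to `growth_rate` new ones."""
    w = rng.standard_normal((x.shape[-1], growth_rate)) * 0.1
    return np.maximum(x @ w, 0.0)  # 1x1 "conv" + ReLU over the channel axis

def dense_block(x, num_layers, growth_rate, rng):
    """Each layer sees the concatenation of all preceding feature maps."""
    for _ in range(num_layers):
        new_features = layer(x, growth_rate, rng)
        x = np.concatenate([x, new_features], axis=-1)
    return x

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 16))
out = dense_block(x, num_layers=4, growth_rate=12, rng=rng)
# Channels grow as 16 + 4 * 12 = 64, and the original input survives untouched.
```

The concatenation (rather than summation) is what gives the stated benefits: every layer has a direct path to the input and to the loss, and features are reused instead of re-learned.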

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size

This work proposes a small DNN architecture called SqueezeNet, which achieves AlexNet-level accuracy on ImageNet with 50x fewer parameters and is able to compress to less than 0.5MB (510x smaller than AlexNet).
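Much of the parameter saving comes from the Fire module: a 1x1 squeeze layer shrinks the channel count before the parallel 1x1/3x3 expand layers. A back-of-the-envelope comparison for the paper's fire2 configuration (96 input channels, squeeze to 16, expand to 64 + 64; biases ignored):

```python
def fire_module_params(c_in, s1, e1, e3):
    """Weights in a Fire module: 1x1 squeeze, then parallel 1x1 and 3x3 expand."""
    squeeze = c_in * s1 * 1 * 1
    expand = s1 * e1 * 1 * 1 + s1 * e3 * 3 * 3
    return squeeze + expand

def plain_conv_params(c_in, c_out):
    """Weights in a standard 3x3 convolution with the same output width."""
    return c_in * c_out * 3 * 3

fire = fire_module_params(96, 16, 64, 64)  # 1536 + 1024 + 9216 = 11776
plain = plain_conv_params(96, 128)         # 110592
```

For the same 96-in/128-out shape, the Fire module uses roughly 9x fewer weights than a plain 3x3 convolution, which is where the 50x whole-network reduction accumulates from.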

Long Short-Term Memory

A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
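The "constant error carousel" is the additive cell-state update c_t = f ⊙ c_{t-1} + i ⊙ tanh(g), which lets gradients flow across long time lags without vanishing. A single-cell numpy sketch with toy dimensions (the i, f, o, g gate ordering is a convention chosen here, not mandated by the paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. W: (4H, X), U: (4H, H), b: (4H,); gate order i, f, o, g."""
    z = W @ x + U @ h + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    c_new = f * c + i * np.tanh(g)   # constant error carousel: additive cell update
    h_new = o * np.tanh(c_new)
    return h_new, c_new

H, X = 4, 3
rng = np.random.default_rng(0)
W = rng.standard_normal((4 * H, X))
U = rng.standard_normal((4 * H, H))
b = np.zeros(4 * H)
h = c = np.zeros(H)
for t in range(5):  # run a short input sequence through the cell
    h, c = lstm_step(rng.standard_normal(X), h, c, W, U, b)
```

Because the cell state is updated by gated addition rather than repeated matrix multiplication, the error signal carried in c can persist over many steps, which is the mechanism behind bridging lags of 1000+ time steps.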