Corpus ID: 174803007

Nail Polish Try-On: Realtime Semantic Segmentation of Small Objects for Native and Browser Smartphone AR Applications

Brendan Duke, Abdalla Ahmed, Edmund Phung, I. Kezele, P. Aarabi
We provide a system for semantic segmentation of small objects that enables nail polish try-on AR applications to run client-side in realtime in native and web mobile applications. By adjusting input resolution and neural network depth, our model design enables a smooth trade-off of performance and runtime, with the highest performance setting achieving 94.5 mIoU at 29.8 ms runtime in native applications on an iPad Pro. We also provide a postprocessing and rendering algorithm for nail…


ICNet for Real-Time Semantic Segmentation on High-Resolution Images
An image cascade network (ICNet) is proposed that incorporates multi-resolution branches under proper label guidance to address the challenging task of real-time semantic segmentation, and an in-depth analysis of the framework is provided.
MobileNetV2: Inverted Residuals and Linear Bottlenecks
A new mobile architecture, MobileNetV2, is described that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes and allows decoupling of the input/output domains from the expressiveness of the transformation.
Pelee: A Real-Time Object Detection System on Mobile Devices
A real-time object detection system is built by combining PeleeNet with the Single Shot MultiBox Detector (SSD) method and optimizing the architecture for speed; its result on COCO outperforms YOLOv2 with higher precision.
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
An extremely computation-efficient CNN architecture named ShuffleNet is introduced, which is designed specially for mobile devices with very limited computing power (e.g., 10-150 MFLOPs), to greatly reduce computation cost while maintaining accuracy.
Laplacian Reconstruction and Refinement for Semantic Segmentation
A multi-resolution reconstruction architecture, akin to a Laplacian pyramid, that uses skip connections from higher resolution feature maps to successively refine segment boundaries reconstructed from lower resolution maps is described.
Optimized Block-Based Connected Components Labeling With Decision Trees
A new paradigm for eight-connection labeling is defined, which employs a general approach to improve neighborhood exploration and minimize the number of memory accesses, together with a new scanning technique that moves on a 2 × 2 pixel grid over the image and is optimized by an automatically generated decision tree.
Loss Max-Pooling for Semantic Image Segmentation
A novel loss max-pooling concept for handling imbalanced training data distributions is introduced; it is applicable as an alternative loss layer in deep neural networks for semantic image segmentation and adaptively re-weights the contribution of each pixel based on its observed loss.
Automatic differentiation in PyTorch
The automatic differentiation module of PyTorch is described. PyTorch is a library designed to enable rapid research on machine learning models, and the module focuses on differentiation of purely imperative programs, with an emphasis on extensibility and low overhead.
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size
This work proposes a small DNN architecture called SqueezeNet, which achieves AlexNet-level accuracy on ImageNet with 50x fewer parameters and can be compressed to less than 0.5MB (510x smaller than AlexNet).
Learning Transferable Architectures for Scalable Image Recognition
This paper proposes to search for an architectural building block on a small dataset and then transfer the block to a larger dataset and introduces a new regularization technique called ScheduledDropPath that significantly improves generalization in the NASNet models.