Monocular Semantic Occupancy Grid Mapping With Convolutional Variational Encoder–Decoder Networks

@article{Lu2019MonocularSO,
  title={Monocular Semantic Occupancy Grid Mapping With Convolutional Variational Encoder–Decoder Networks},
  author={Chenyang Lu and M. V. D. van de Molengraft and Gijs Dubbelman},
  journal={IEEE Robotics and Automation Letters},
  year={2019},
  volume={4},
  pages={445--452}
}
  • Chenyang Lu, M. V. D. van de Molengraft, Gijs Dubbelman
  • Published 2019
  • Engineering, Computer Science
  • IEEE Robotics and Automation Letters
  • In this letter, we research and evaluate end-to-end learning of monocular semantic-metric occupancy grid mapping from weak binocular ground truth. The network learns to predict four classes, as well as a camera to bird's eye view mapping. At the core, it utilizes a variational encoder–decoder network that encodes the front-view visual information of the driving scene and subsequently decodes it into a two-dimensional top-view Cartesian coordinate system. The evaluations on Cityscapes show that…
    Predicting Semantic Map Representations From Images Using Pyramid Occupancy Networks
    1
    Driving among Flatmobiles: Bird-Eye-View occupancy grids from a monocular camera for holistic trajectory planning
    MonoLayout: Amodal scene layout from a single image
    1
    On Boosting Semantic Street Scene Segmentation with Weak Supervision
    2
    Semantic Foreground Inpainting From Weak Supervision
    1

    References

    Publications referenced by this paper.
    CNN-SLAM: Real-Time Dense Monocular SLAM with Learned Depth Prediction
    286
    Depth Map Prediction from a Single Image using a Multi-Scale Deep Network
    1511
    Unsupervised Monocular Depth Estimation with Left-Right Consistency
    1014
    SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
    3027
    Fully Convolutional Networks for Semantic Segmentation
    6589
    Incremental dense semantic stereo fusion for large-scale semantic scene reconstruction
    133