Corpus ID: 214623410

A Simple Fix for Convolutional Neural Network via Coordinate Embedding

@article{Ren2020ASF,
  title={A Simple Fix for Convolutional Neural Network via Coordinate Embedding},
  author={Liliang Ren and Zhuonan Hao},
  journal={ArXiv},
  year={2020},
  volume={abs/2003.10589}
}
Convolutional Neural Networks (CNNs) have been widely applied in computer vision. However, because CNN models are translation invariant, they are not aware of the coordinate information of each pixel. This limits the generalization ability of CNNs, since coordinate information is crucial for a model to learn affine transformations, which operate directly on the coordinates of each pixel. In this project, we propose a simple approach to incorporate the coordinate…
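The abstract is truncated here, so the authors' exact method is not shown. As a rough illustration of what "coordinate information" can mean in practice, the sketch below appends two extra channels that hold each pixel's normalized row and column position, in the spirit of the coordinate channels described for CoordConv in the references. The function name `add_coord_channels`, the channels-first nested-list layout, and the [-1, 1] scaling are assumptions made for this example, not details taken from the paper.

```python
from typing import List

# Hypothetical layout for this sketch: channels x height x width.
Image = List[List[List[float]]]


def add_coord_channels(image: Image) -> Image:
    """Return a copy of `image` with two extra channels holding each
    pixel's normalized row (y) and column (x) position in [-1, 1]."""
    height = len(image[0])
    width = len(image[0][0])

    def scale(index: int, size: int) -> float:
        # Map 0..size-1 onto -1..1; a single-pixel axis maps to 0.
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    y_channel = [[scale(r, height) for _ in range(width)] for r in range(height)]
    x_channel = [[scale(c, width) for c in range(width)] for _ in range(height)]
    # Copy the original channels so the input is left unmodified.
    return [[row[:] for row in channel] for channel in image] + [y_channel, x_channel]


image = [[[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]]  # 1 channel, 2 x 3 pixels
augmented = add_coord_channels(image)
print(len(augmented))  # 3 channels: intensity, y, x
print(augmented[1])    # [[-1.0, -1.0, -1.0], [1.0, 1.0, 1.0]]
print(augmented[2])    # [[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]
```

A convolution applied to the augmented input sees position alongside intensity, so its filters can respond differently depending on where in the image a pattern appears.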

References

An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution
TLDR
Shows preliminary evidence that swapping convolution for CoordConv can improve models on a diverse set of tasks. CoordConv works by giving convolution access to its own input coordinates through extra coordinate channels, without sacrificing the computational and parametric efficiency of ordinary convolution.
SSD: Single Shot MultiBox Detector
TLDR
The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.
Evaluation of deep neural networks for traffic sign detection systems
TLDR
This paper analyses the state-of-the-art of several object-detection systems combined with various feature extractors previously developed by their corresponding authors, finding that Faster R-CNN Inception Resnet V2 obtains the best mAP, while R-FCN Resnet 101 strikes the best trade-off between accuracy and execution time.
Microsoft COCO: Common Objects in Context
We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene…
YOLO9000: Better, Faster, Stronger
TLDR
YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories, is introduced and a method to jointly train on object detection and classification is proposed, both novel and drawn from prior work.
Detection of traffic signs in real-world images: The German traffic sign detection benchmark
TLDR
This work introduces a real-world benchmark data set for traffic sign detection together with carefully chosen evaluation metrics, baseline results, and a web-interface for comparing approaches, and presents the best-performing algorithms of the IJCNN competition.