Exemplar Driven Character Recognition in the Wild

Karthik Sheshadri and Santosh Kumar Divvala. British Machine Vision Conference.

Character recognition in natural scenes continues to represent a formidable challenge in computer vision. Traditional optical character recognition (OCR) methods fail to perform well on characters from scene text owing to a variety of difficulties in background clutter, binarisation, and arbitrary skew. Further, English characters group into only 62 classes whereas many of the world’s languages have several hundred classes. In particular, most Indic script languages such as Kannada exhibit… 

Comparative Study of Preprocessing and Classification Methods in Character Recognition of Natural Scene Images

A technique for classifying characters, based on a pipeline of image processing operations and ensemble machine learning techniques, that tackles cases where Optical Character Recognition (OCR) fails.

Character Recognition via a Compact Convolutional Neural Network

This paper proposes a deep learning method based on convolutional neural networks to recognize characters and words in scene images using the original VGG-Net, and shows that the method can achieve state-of-the-art performance while having a more compact representation.

Exploiting Color Information for Better Scene Text Recognition

A pipeline of image processing operations involving bilateral regression is proposed for identifying characters in images, together with a pre-processing step that increases the performance of bilateral-regression-based character identification.

Recognition and Retrieval in Natural Scene Images

This thesis proposes an iterative method that alternates between finding the most likely solution and refining the interaction potentials, and presents two contrasting end-to-end recognition frameworks for text analysis in scene images.

Multilingual Scene Character Recognition System using Sparse Auto-Encoder for Efficient Local Features Representation in Bag of Features

The Bag of Features-based model was extended using deep learning for representing features for accurate SCR of different languages and a deep Sparse Auto-encoder (SAE)-based strategy was applied to enhance the representative and discriminative abilities of image features.

Understanding Text in Scene Images

This thesis proposes a robust text segmentation (binarization) technique and uses it to improve the recognition performance of scene text. It also presents an energy minimization framework that exploits both bottom-up and top-down cues for recognizing words extracted from street images.

A Large Chinese Text Dataset in the Wild

This paper provides details of a newly created dataset of Chinese text, with about 1 million Chinese characters from 3,850 unique ones annotated by experts in over 30,000 street view images, and gives baseline results using state-of-the-art methods.

Learning Spatially Embedded Discriminative Part Detectors for Scene Character Recognition

A discriminative character representation is proposed by aggregating the responses of the spatially embedded salient part detectors by extracting the convolution activations from the pre-trained convolutional neural network (CNN).

Chinese Text in the Wild

A newly created dataset of Chinese text with about 1 million Chinese characters annotated by experts in over 30 thousand street view images, suitable for training robust neural networks for various tasks, particularly detection and recognition.

Scene character recognition using PCANet

The proposed method achieves promising performance on the Chars74K-15 dataset and the ICDAR03-CH dataset, demonstrating the effectiveness of PCANet in scene character recognition.



Character Recognition in Natural Images

It is demonstrated that the performance of the proposed method can be far superior to that of commercial OCR systems, and can benefit from synthetically generated training data obviating the need for expensive data collection and annotation.

Word Spotting in the Wild

It is argued that the appearance of words in the wild spans this range of difficulties and a new word recognition approach based on state-of-the-art methods from generic object recognition is proposed, in which object categories are considered to be the words themselves.

End-to-end scene text recognition

While scene text recognition has generally been treated with highly domain-specific methods, the results demonstrate the suitability of applying generic computer vision methods.

An Exemplar Model for Learning Object Classes

An exemplar model that can learn and generate a region of interest around class instances in a training set, given only a set of images containing the visual class, which enables the detection of multiple instances of the object class in test images.

A Method for Text Localization and Recognition in Real-World Images

The paper is the first to report both text detection and recognition results on the standard and rather challenging ICDAR 2003 dataset. The text localization works for a number of alphabets, and the method is easily adapted to recognition of other scripts, e.g. Cyrillic.

Ensemble of exemplar-SVMs for object detection and beyond

This paper proposes a conceptually simple but surprisingly powerful method which combines the effectiveness of a discriminative object detector with the explicit correspondence offered by a

ICDAR 2003 robust reading competitions

The robust reading problem was broken down into three sub-problems (text locating, character recognition, and word recognition), with a competition for each stage and also a competition for the best overall system. The text locating contest was the only one to have any entries.

Histograms of oriented gradients for human detection

N. Dalal and B. Triggs. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.

Poselets: Body part detectors trained using 3D human pose annotations

A new dataset, H3D, is built of annotations of humans in 2D photographs with 3D joint information, inferred using anthropometric constraints, to address the classic problems of detection, segmentation and pose estimation of people in images with a novel definition of a part, a poselet.

Shape matching and object recognition using shape contexts

This paper presents work on computing shape models that are computationally fast and invariant to basic transformations such as translation, scaling, and rotation, and proposes shape detection using a feature called shape context, which describes the shape of the object.