SqueezedText: A Real-Time Scene Text Recognition by Binary Convolutional Encoder-Decoder Network

@inproceedings{Liu2018SqueezedTextAR,
  title={SqueezedText: A Real-Time Scene Text Recognition by Binary Convolutional Encoder-Decoder Network},
  author={Zichuan Liu and Yixing Li and Fengbo Ren and Wang Ling Goh and Hao Yu},
  booktitle={AAAI},
  year={2018}
}
A new approach for real-time scene text recognition is proposed in this paper. [] Key Method With the elaborated character detection, the back-end Bi-RNN merely processes a low dimension feature sequence with category and spatial information of extracted characters for sequence correction and classification. By training with over 1,000,000 synthetic scene text images, the B-CEDNet achieves a recall rate of 0.86, precision of 0.88 and F-score of 0.87 on ICDAR-03 and ICDAR-13. With the correction and…

Figures and Tables from this paper

A Simple and Robust Convolutional-Attention Network for Irregular Text Recognition
TLDR
This work proposes a simple yet robust approach for scene text recognition with no need to convert input images to sequence representations, and directly connects two-dimensional CNN features to an attention-based sequence decoder.
A Simple and Strong Convolutional-Attention Network for Irregular Text Recognition
TLDR
This work proposes a simple yet robust approach for scene text recognition with no need to convert input images to sequence representations, and directly connects two-dimensional CNN features to an attention-based sequence decoder.
Flexible scene text recognition based on dual attention mechanism
TLDR
An end‐to‐end trainable and flexible STR method based on a dual attention mechanism that is comparable to 13 existing methods and average text recognition accuracy of the proposed method is about 1.4% higher than the state‐of‐the‐art method.
Sequential alignment attention model for scene text recognition
FACLSTM: ConvLSTM with focused attention for scene text recognition
TLDR
This paper argues that scene text recognition is essentially a spatiotemporal prediction problem for its 2-D image inputs, and proposes a convolution LSTM (ConvLSTM)-based scene text recognizer, namely, FACL STM, where the spatial correlation of pixels is fully leveraged when performing sequential prediction with L STM.
Sequence-To-Sequence Domain Adaptation Network for Robust Text Image Recognition
TLDR
Extensive text recognition experiments show the SSDAN could efficiently transfer sequence knowledge and validate the promising power of the proposed model towards real world applications in various recognition scenarios, including the natural scene text, handwritten text and even mathematical expression recognition.
...
...

References

SHOWING 1-10 OF 37 REFERENCES
Reading Scene Text in Deep Convolutional Sequences
TLDR
A deep recurrent model is developed to robustly recognize the generated CNN sequences, departing from most existing approaches recognising each character independently, achieving impressive results on several benchmarks, advancing the state-of-the-art substantially.
Reading Text in the Wild with Convolutional Neural Networks
TLDR
An end-to-end system for text spotting—localising and recognising text in natural scene images—and text based image retrieval and a real-world application to allow thousands of hours of news footage to be instantly searchable via a text query is demonstrated.
An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition
TLDR
A novel neural network architecture, which integrates feature extraction, sequence modeling and transcription into a unified framework, is proposed, which generates an effective yet much smaller model, which is more practical for real-world application scenarios.
End-to-end text recognition with convolutional neural networks
TLDR
This paper combines the representational power of large, multilayer neural networks together with recent developments in unsupervised feature learning, which allows them to use a common framework to train highly-accurate text detector and character recognizer modules.
Accurate Scene Text Recognition Based on Recurrent Neural Network
TLDR
This paper presents a novel approach to recognize text in scene images that outperforms the state-of-the-art techniques significantly and is able to recognize the whole word images without character-level segmentation and recognition.
Deep Features for Text Spotting
TLDR
A Convolutional Neural Network classifier is developed that can be used for text spotting in natural images and a method of automated data mining of Flickr, that generates word and character level annotations is used to form an end-to-end, state-of-the-art text spotting system.
Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition
In this work we present a framework for the recognition of natural scene text. Our framework does not require any human-labelled data, and performs word recognition on the whole image holistically,
End-to-end scene text recognition
TLDR
While scene text recognition has generally been treated with highly domain-specific methods, the results demonstrate the suitability of applying generic computer vision methods.
Label Embedding: A Frugal Baseline for Text Recognition
TLDR
The main conclusion of the paper is that with such a frugal approach it is possible to obtain results which are competitive with standard bottom-up approaches, thus establishing label embedding as an interesting and simple to compute baseline for text recognition.
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
TLDR
Quantitative assessments show that SegNet provides good performance with competitive inference time and most efficient inference memory-wise as compared to other architectures, including FCN and DeconvNet.
...
...