A Computationally Efficient Pipeline Approach to Full Page Offline Handwritten Text Recognition

@article{Chung2019ACE,
  title={A Computationally Efficient Pipeline Approach to Full Page Offline Handwritten Text Recognition},
  author={Jonathan Chung and Thomas Delteil},
  journal={2019 International Conference on Document Analysis and Recognition Workshops (ICDARW)},
  year={2019},
  volume={5},
  pages={35-40}
}
  • Jonathan Chung, Thomas Delteil
  • Published 1 September 2019
  • Computer Science
  • 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW)
Offline handwriting recognition with deep neural networks is usually limited to words or lines due to large computational costs. In this paper, a less computationally expensive full page offline handwritten text recognition framework is introduced. This framework includes a pipeline that locates handwritten text with an object detection neural network and recognises the text within the detected regions using features extracted with a multi-scale convolutional neural network (CNN) fed into a… 
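
The pipeline the abstract describes is, at a high level, a text-detection stage followed by per-region recognition. The sketch below illustrates that detect-then-recognise structure in PyTorch; the module names, layer sizes, and the BiLSTM-with-CTC output head are illustrative assumptions, not the authors' exact architecture.

# Hypothetical sketch of a detect-then-recognise full-page HTR pipeline.
# Module names, layer sizes, and the BiLSTM head are illustrative assumptions.
import torch
import torch.nn as nn


class WordRecognizer(nn.Module):
    """CNN feature extractor followed by a BiLSTM and a CTC-style output layer."""

    def __init__(self, vocab_size: int, hidden: int = 128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.rnn = nn.LSTM(64, hidden, bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * hidden, vocab_size + 1)  # +1 for the CTC blank

    def forward(self, crop: torch.Tensor) -> torch.Tensor:
        # crop: (B, 1, H, W) grayscale image of one detected text region
        feats = self.cnn(crop)                      # (B, 64, H/4, W/4)
        feats = feats.mean(dim=2).permute(0, 2, 1)  # collapse height -> (B, W/4, 64)
        seq, _ = self.rnn(feats)                    # (B, W/4, 2*hidden)
        return self.head(seq)                       # per-timestep character logits


def recognize_page(page, detector, recognizer):
    """Detect text regions on a full page, crop each one, and transcribe it."""
    boxes = detector(page)  # assumed to return a list of (x0, y0, x1, y1) pixel boxes
    outputs = []
    for x0, y0, x1, y1 in boxes:
        crop = page[:, :, y0:y1, x0:x1]
        outputs.append(recognizer(crop))  # decode with CTC greedy/beam search later
    return outputs

In the paper, recognition uses features extracted with a multi-scale CNN fed into a sequence model; the plain CNN above is only a stand-in for that component, and the detector is assumed to be an object-detection network as described in the abstract.
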
Citations

Offline Handwritten Text Recognition Using Deep Learning: A Review
This review introduces how the problem has been handled in the past few decades and analyzes the latest advancements and potential directions for future research in the field.
SPAN: a Simple Predict & Align Network for Handwritten Paragraph Recognition
The Simple Predict & Align Network is proposed: an end-to-end, recurrence-free fully convolutional network that performs OCR at paragraph level without any prior segmentation stage and without any loss of accuracy.
A Sequential Handwriting Recognition Model Based on a Dynamically Configurable CRNN
The proposed DC-CRNN is based on the Salp Swarm Optimization Algorithm, which generates the optimal structure and hyperparameters for convolutional recurrent neural networks (CRNNs), and it outperforms handcrafted CRNN methods.
A Review on Document Information Extraction Approaches
Information extraction from documents has become an important application of natural language processing. Most entity extraction methodologies depend on the application context, such as the medical domain, …
End-to-end Handwritten Paragraph Text Recognition Using a Vertical Attention Network
This work proposes a unified end-to-end model using hybrid attention to tackle handwritten text recognition and achieves state-of-the-art character error rates at line and paragraph level on three popular datasets.
OrigamiNet: Weakly-Supervised, Segmentation-Free, One-Step, Full Page Text Recognition by learning to unfold
A novel and simple neural network module, termed OrigamiNet, is proposed that can augment any CTC-trained, fully convolutional single-line text recognizer and convert it into a multi-line version by providing the model with enough spatial capacity to collapse a 2D input signal into 1D without losing information.
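
The central mechanism described above, giving the network enough spatial capacity that a two-dimensional feature map can be collapsed into a one-dimensional sequence suitable for CTC, can be illustrated roughly as follows; the tensor sizes are arbitrary assumptions and the plain reshape stands in for OrigamiNet's learned unfolding stages.

# Schematic illustration of collapsing a 2D feature map into a 1D sequence for
# CTC, in the spirit of the "unfolding" described above; sizes are arbitrary
# assumptions and the reshape stands in for OrigamiNet's learned unfolding.
import torch

batch, channels, height, width = 2, 64, 8, 200
feats = torch.randn(batch, channels, height, width)  # multi-line 2D feature map

# Read the map row by row and concatenate the rows along the time axis, turning
# H lines of width W into one sequence of length H * W for a CTC output layer.
seq = feats.permute(0, 2, 3, 1).reshape(batch, height * width, channels)
print(seq.shape)  # torch.Size([2, 1600, 64])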

References

Showing 1-10 of 23 references
Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks
Offline handwriting recognition, the automatic transcription of images of handwritten text, is a challenging task that combines computer vision with sequence learning. In most systems the two elements …
Fully convolutional network with dilated convolutions for handwritten text line segmentation
A learning-based method for handwritten text line segmentation in document images using a variant of deep fully convolutional networks (FCNs) with dilated convolutions, which outperforms the most popular FCN variants based on deconvolution or unpooling layers on a public dataset.
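
As a rough illustration of the dilated-convolution idea, a minimal fully convolutional segmentation block might look like the sketch below; the channel counts, dilation rates, and single-channel output map are assumptions for illustration, not the paper's configuration.

# Minimal fully convolutional block with dilated convolutions, sketching how a
# segmentation network can grow its receptive field without pooling; channel
# counts and dilation rates are illustrative assumptions, not the paper's design.
import torch
import torch.nn as nn

seg_net = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1, dilation=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=2, dilation=2), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=4, dilation=4), nn.ReLU(),
    nn.Conv2d(32, 1, 1),  # per-pixel text-line score map, same resolution as input
)

page = torch.randn(1, 1, 256, 256)       # dummy grayscale page image
line_map = torch.sigmoid(seg_net(page))  # (1, 1, 256, 256) line probability map
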
Start, Follow, Read: End-to-End Full-Page Handwriting Recognition
A deep learning model that jointly learns text detection, segmentation, and recognition using mostly images without detection or segmentation annotations, and that exceeds the performance of the winner of the ICDAR 2017 handwriting recognition competition.
Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition
A modification of the popular and efficient multi-dimensional long short-term memory recurrent neural networks (MDLSTM-RNNs) to enable end-to-end processing of handwritten paragraphs, replacing the collapse layer, which transforms the two-dimensional representation into a sequence of predictions, with a recurrent version that can recognize one line at a time.
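
The modification described above turns a global collapse into one that emits a single line at a time. The sketch below gives a loose, simplified rendering of that idea using a learned attention over the vertical axis of the feature map; the GRU state, layer sizes, and scoring function are simplifying assumptions, not the MDLSTM-RNN formulation used in the paper.

# Loose sketch of a recurrent "collapse" that attends over the vertical axis of
# a 2D paragraph feature map and emits one line of features per step; the GRU
# state, layer sizes, and scoring function are simplifying assumptions.
import torch
import torch.nn as nn


class LineCollapse(nn.Module):
    def __init__(self, channels: int, hidden: int = 64):
        super().__init__()
        self.state = nn.GRUCell(channels, hidden)     # tracks which line comes next
        self.score = nn.Linear(channels + hidden, 1)  # scores each feature-map row

    def forward(self, feats: torch.Tensor, num_lines: int):
        # feats: (B, C, H, W) two-dimensional paragraph representation
        B, C, H, W = feats.shape
        rows = feats.mean(dim=3).permute(0, 2, 1)     # (B, H, C) row summaries
        h = feats.new_zeros(B, self.state.hidden_size)
        lines = []
        for _ in range(num_lines):
            context = h.unsqueeze(1).expand(B, H, h.size(1))
            alpha = torch.softmax(self.score(torch.cat([rows, context], dim=2)), dim=1)
            # weighted sum over the height axis -> one collapsed line, (B, C, W)
            line = (feats * alpha.permute(0, 2, 1).unsqueeze(3)).sum(dim=2)
            lines.append(line)
            h = self.state(line.mean(dim=2), h)       # advance to the next line
        return lines  # each entry is then transcribed like a single text line
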
Are Multidimensional Recurrent Layers Really Necessary for Handwritten Text Recognition?
  • J. Puigcerver
  • Computer Science
  • 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)
  • 2017
This work observes that using random distortions during training as synthetic data augmentation dramatically improves the accuracy of the model, suggesting that two-dimensional long-term dependencies may not be essential to achieve good recognition accuracy, at least in the lower layers of the architecture.
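
The synthetic data augmentation mentioned above amounts to applying a fresh random distortion to each training image. A minimal sketch with torchvision (assuming a recent version that applies transforms to tensors) is shown below; the specific distortion ranges are arbitrary assumptions, not the values used in the paper.

# Sketch of random-distortion augmentation for text-line images, assuming a
# recent torchvision that applies transforms to tensors; the distortion ranges
# are arbitrary assumptions, not the values used in the paper.
import torch
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomAffine(degrees=2, translate=(0.02, 0.05),
                            scale=(0.9, 1.1), shear=5, fill=1.0),
    transforms.GaussianBlur(kernel_size=3, sigma=(0.1, 1.0)),
])

line_img = torch.rand(1, 64, 512)  # dummy grayscale line image in [0, 1]
distorted = augment(line_img)      # a fresh random distortion every call
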
Accurate, Data-Efficient, Unconstrained Text Recognition with Convolutional Neural Networks
The proposed architecture is a fully convolutional network without any recurrent connections, trained with the CTC loss function, which operates on arbitrary input sizes and produces strings of arbitrary length in a very efficient and parallelizable manner.
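
The CTC training objective referred to above can be exercised with a few lines of PyTorch; the tensor shapes, vocabulary size, and randomly initialised log-probabilities below are placeholders standing in for the output of the fully convolutional recogniser.

# Minimal exercise of the CTC loss mentioned above; shapes and vocabulary size
# are placeholders, not taken from the paper.
import torch
import torch.nn as nn

vocab_size = 80                                   # characters, blank at index 0
T, N = 120, 4                                     # time steps, batch size
logits = torch.randn(T, N, vocab_size + 1, requires_grad=True)
log_probs = logits.log_softmax(dim=2)             # (T, N, C) as CTCLoss expects

targets = torch.randint(1, vocab_size + 1, (N, 25))    # dummy labels, no blanks
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 25, dtype=torch.long)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()  # gradients flow back to whatever produced the logits
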
Handwriting Recognition by Attribute Embedding and Recurrent Neural Networks
This paper proposes a handwriting recognition method that adapts attribute embedding to sequence learning and obtains promising results even without the use of any kind of dictionary or language model.
Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention
An attention-based model for end-to-end handwriting recognition that is able to learn the reading order, enabling it to handle bidirectional scripts such as Arabic and bringing hope of performing full paragraph transcription in the near future.
A Compact CNN-DBLSTM Based Character Model for Offline Handwriting Recognition with Tucker Decomposition
  • Haisong Ding, Kai Chen, +4 authors Qiang Huo
  • Computer Science
  • 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)
  • 2017
The results show that using Tucker decomposition alone offers a good solution for building a compact CNN-DBLSTM model, reducing both the footprint and latency significantly without degrading recognition accuracy.
Dropout Improves Recurrent Neural Networks for Handwriting Recognition
It is shown that RNNs with long short-term memory cells can be greatly improved using dropout, a recently proposed regularization method for deep architectures, even when the network mainly consists of recurrent and shared connections.
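
In the spirit of the finding above, the kind of dropout in question is applied on the non-recurrent connections between stacked recurrent layers; PyTorch's nn.LSTM exposes this directly, as in the small sketch below with arbitrary layer sizes and dropout rate.

# Dropout applied between stacked LSTM layers (non-recurrent connections);
# layer sizes and the dropout rate are arbitrary assumptions.
import torch
import torch.nn as nn

rnn = nn.LSTM(input_size=64, hidden_size=128, num_layers=3,
              dropout=0.5, batch_first=True)  # dropout between layers, not within
features = torch.randn(4, 100, 64)            # (batch, time steps, features)
output, _ = rnn(features)                     # (4, 100, 128)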