MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition

  @inproceedings{bhunia2021metahtr,
    title={MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition},
    author={Ayan Kumar Bhunia and S. Ghose and Amandeep Kumar and Pinaki Nath Chowdhury and Aneeshan Sain and Yi-Zhe Song},
    booktitle={2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2021}
  }
  • Published 5 April 2021
Handwritten Text Recognition (HTR) remains a challenging problem to date, largely due to the varying writing styles that exist amongst us. Prior works, however, generally operate on the assumption that there is a limited number of styles, most of which have already been captured by existing datasets. In this paper, we take a completely different perspective: we work on the assumption that there is always a new style that is drastically different, and that we will only have very limited data…
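The writer-adaptive setting described in the abstract is commonly framed as MAML-style meta-learning: treat each writer as a task, adapt to a few support samples with an inner gradient step, and meta-train the shared initialisation so that this adaptation works well. The sketch below is purely illustrative, not the paper's actual method: it uses a toy linear model in place of an HTR network, a first-order approximation of the outer gradient, and all names (`loss_and_grad`, `maml_step`, `make_task`) are hypothetical.

```python
import numpy as np

def loss_and_grad(w, X, y):
    """Squared-error loss of a linear model and its gradient w.r.t. w."""
    err = X @ w - y
    return float(np.mean(err ** 2)), 2 * X.T @ err / len(y)

def maml_step(w, tasks, inner_lr=0.05, outer_lr=0.01):
    """One meta-update: adapt to each task's support set with a single
    gradient step, then move the shared initialisation w using the
    post-adaptation (query) gradient, averaged over tasks
    (first-order MAML approximation)."""
    meta_grad = np.zeros_like(w)
    for (Xs, ys), (Xq, yq) in tasks:
        _, g = loss_and_grad(w, Xs, ys)           # inner step on support set
        w_adapted = w - inner_lr * g              # "writer-specific" weights
        _, gq = loss_and_grad(w_adapted, Xq, yq)  # query loss gradient
        meta_grad += gq
    return w - outer_lr * meta_grad / len(tasks)

# Toy "writers": each task is y = a * x with a different slope a,
# standing in for a writer's individual style.
rng = np.random.default_rng(0)

def make_task(a):
    Xs, Xq = rng.normal(size=(5, 1)), rng.normal(size=(5, 1))
    return (Xs, a * Xs[:, 0]), (Xq, a * Xq[:, 0])

tasks = [make_task(a) for a in (0.8, 1.0, 1.2)]
w = np.zeros(1)
for _ in range(500):
    w = maml_step(w, tasks)
print(float(w[0]))  # initialisation should land near the mean slope (~1.0)
```

The point of the toy: the meta-learned initialisation sits where a single cheap gradient step on a handful of support samples suffices to specialise to any one task, which mirrors adapting a recogniser to an unseen writer from very limited data.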


APRNet: Attention-based Pixel-wise Rendering Network for Photo-Realistic Text Image Generation
Style-guided text image generation tries to synthesize a text image by imitating a reference image's appearance while keeping the text content unaltered; the text image appearance includes many aspects.
Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition
It is argued that semantic information offers a complementary role in addition to visual only by proposing a multi-stage multi-scale attentional decoder that performs joint visual-semantic reasoning in a stage-wise manner.
Text is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation
This paper aims for a single model that can compete favourably with two separate state-of-the-art STR and HTR models, and proposes four distillation losses, all of which are specifically designed to cope with the aforementioned unique characteristics of text recognition.


ASTER: An Attentional Scene Text Recognizer with Flexible Rectification
This work introduces ASTER, an end-to-end neural network model that comprises a rectification network and a recognition network that predicts a character sequence directly from the rectified image.
Learning Meta Face Recognition in Unseen Domains
This paper proposes a novel face recognition method via meta-learning named Meta Face Recognition (MFR), which synthesizes the source/target domain shift with a meta-optimization objective that requires the model to learn effective representations not only on synthesized source domains but also on synthesized target domains.
Learning to Forget for Meta-Learning
This work proposes task-and-layer-wise attenuation on the compromised initialization of model-agnostic meta-learning to reduce its influence and names the method as L2F (Learn to Forget).
SCATTER: Selective Context Attentional Scene Text Recognizer
A novel architecture for STR is introduced, named Selective Context ATtentional Text Recognizer (SCATTER), that utilizes a stacked block architecture with intermediate supervision during training, that paves the way to successfully train a deep BiLSTM encoder, thus improving the encoding of contextual dependencies.
Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition
This work proposes an easy-to-implement strong baseline for irregular scene text recognition, using off-the-shelf neural network components and only word-level annotations, and achieves state-of-the-art performance on both regular and irregular scene text recognition benchmarks.
Meta-SGD: Learning to Learn Quickly for Few-Shot Learning
  • arXiv preprint arXiv:1707.09835, 2017
The IAM-database: an English sentence database for offline handwriting recognition
A database that consists of handwritten English sentences based on the Lancaster-Oslo/Bergen corpus; it is expected to be particularly useful for recognition tasks where linguistic knowledge beyond the lexicon level is used.
GANwriting: Content-Conditioned Generation of Styled Handwritten Word Images
This work proposes a novel method that is able to produce credible handwritten word images by conditioning the generative process on both calligraphic style features and textual content, significantly advancing over prior art.
Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition
Extensive experiments on various benchmarks show that the proposed augmentation and the joint learning methods significantly boost the performance of the recognition networks.
ICDAR 2009 Handwriting Recognition Competition
  • E. Grosicki, H. E. Abed
  • 2009 10th International Conference on Document Analysis and Recognition, 2009
This paper describes the handwriting recognition competition held at ICDAR 2009, based on the RIMES database of French handwritten text documents, and reports the competition results.