PP-OCRv3: More Attempts for the Improvement of Ultra Lightweight OCR System

@article{Li2022PPOCRv3MA,
  title={PP-OCRv3: More Attempts for the Improvement of Ultra Lightweight OCR System},
  author={Chenxia Li and Weiwei Liu and Ruoyu Guo and Xiaoyue Yin and Kaitao Jiang and Yongkun Du and Yuning Du and Lingfeng Zhu and Baohua Lai and Xiaoguang Hu and Dianhai Yu and Yanjun Ma},
  journal={ArXiv},
  year={2022},
  volume={abs/2206.03001}
}
Optical character recognition (OCR) technology has been widely used in various scenarios, as shown in Figure 1. Designing a practical OCR system is still a meaningful but challenging task. In previous work, considering the efficiency and accuracy, we proposed a practical ultra lightweight OCR system (PP-OCR), and an optimized version PP-OCRv2. In order to further improve the performance of PP-OCRv2, a more robust OCR system PP-OCRv3 is proposed in this paper. PP-OCRv3 upgrades the text…

Optical Character Recognition of Electrical Equipment Nameplate with Contrast Enhancement

An OCR system is proposed that reliably handles the reflection effects common in electrical equipment nameplate images captured under challenging ambient light conditions. It applies a robust contrast enhancement method, based on a logarithmic mapping function (LMF), to the nameplate images before recognition with a state-of-the-art OCR network, PP-OCRv3.

Evaluating and Improving Optical Character Recognition (OCR) Efficiency in Recognizing Mandarin Phrases with Phonetic Symbols

  • S. Lo, H. Chou
  • Computer Science
    2022 IEEE International Conference on Internet of Things and Intelligence Systems (IoTaIS)
  • 2022
This study runs experiments on OCR recognition of images containing Mandarin phrases with phonetic symbols alongside them and, based on preliminary results, proposes candidate methods for improving recognition efficiency in future work.

CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition

A novel network named CDistNet is developed that stacks multiple MDCDPs to guide progressively more precise distance modeling, and it is shown that feature-character alignment is well built even when various recognition difficulties are present.

Research on Text Recognition of Natural Scenes for Complex Situations

The continued development of scene text detection and recognition algorithms will lay the foundation for research on related recognition problems, such as multilingual scene text recognition and formula recognition.

References

PP-OCRv2: Bag of Tricks for Ultra Lightweight OCR System

A bag of tricks for training a better text detector and a better text recognizer is introduced, including Collaborative Mutual Learning (CML), CopyPaste, Lightweight CPU Network, Unified-Deep Mutual Learning (U-DML), and Enhanced CTCLoss.

PP-OCR: A Practical Ultra Lightweight OCR System

This paper proposes a practical ultra lightweight OCR system, i.e., PP-OCR, with an overall model size of only 3.5M, and introduces a bag of strategies to either enhance the model ability or reduce the model size.

ICDAR2019 Robust Reading Challenge on Multi-lingual Scene Text Detection and Recognition — RRC-MLT-2019

The dataset, tasks, and findings of the RRC-MLT-2019 challenge are presented; the challenge has 4 tasks covering various aspects of multi-lingual scene text.

GTC: Guided Training of CTC Towards Efficient and Accurate Scene Text Recognition

This work proposes the guided training of CTC (GTC), where CTC model learns a better alignment and feature representations from a more powerful attentional guidance, and achieves robust and accurate prediction for both regular and irregular scene text while maintaining a fast inference speed.

PP-LCNet: A Lightweight CPU Convolutional Neural Network

A lightweight CPU network based on the MKLDNN acceleration strategy, named PP-LCNet, which improves the performance of lightweight models on multiple tasks and can greatly surpass previous network structures at the same inference time for classification.

An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition

A novel neural network architecture is proposed that integrates feature extraction, sequence modeling and transcription into a unified framework; it yields an effective yet much smaller model that is more practical for real-world application scenarios.

SVTR: Scene Text Recognition with a Single Visual Model

A Single Visual model for Scene Text recognition within the patch-wise image tokenization framework, which dispenses with the sequential modeling entirely and is effective on both English and Chinese scene text recognition tasks.

Context-Based Contrastive Learning for Scene Text Recognition

A novel framework, Context-based contrastive learning (ConCLR), that significantly improves out-of-vocabulary generalization and achieves state-of-the-art performance on public benchmarks together with attention-based recognizers.

Towards End-to-End License Plate Detection and Recognition: A Large Dataset and Baseline

This paper presents a novel network model that predicts the bounding box and recognizes the corresponding LP number simultaneously with high speed and accuracy, and demonstrates that the model outperforms current object detection and recognition approaches in both accuracy and speed.

ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction

The ICDAR 2019 Challenge on "Scanned receipts OCR and key information extraction" (SROIE) covers important aspects related to the automated analysis of scanned receipts, and is expected to evolve into a useful resource for the community, drawing further attention and promoting research and development efforts in this field.