Searching a High-Performance Feature Extractor for Text Recognition Network

  title={Searching a High-Performance Feature Extractor for Text Recognition Network},
  author={Hui Zhang and Quanming Yao and James Tin-Yau Kwok and Xiang Bai},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  • Published 12 September 2022
  • Computer Science
The feature extractor plays a critical role in text recognition (TR), but customizing its architecture remains relatively unexplored due to expensive manual tweaking. In this work, inspired by the success of neural architecture search (NAS), we propose to search for suitable feature extractors. We design a domain-specific search space by exploring principles for good feature extractors. The space includes a 3D-structured space for the spatial model and a transformer-based space for the… 
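The search procedure the abstract alludes to can be illustrated with a toy random-search loop over a small architecture space. Everything here (the operation set, stride choices, and the proxy score) is an illustrative assumption, not the paper's actual search space or algorithm:

```python
import random

# Toy NAS-style random search over a feature-extractor space.
# OPS, STRIDES, and proxy_score are illustrative assumptions only.
OPS = ["conv3x3", "conv5x5", "depthwise", "identity"]
STRIDES = [(1, 1), (2, 1), (2, 2)]  # (height, width) downsampling per stage

def sample_architecture(num_stages=4, rng=random):
    """Sample one candidate: an (operation, stride) pair per stage."""
    return [(rng.choice(OPS), rng.choice(STRIDES)) for _ in range(num_stages)]

def proxy_score(arch):
    """Dummy stand-in for validation accuracy of a candidate."""
    cost = {"conv3x3": 2, "conv5x5": 3, "depthwise": 1, "identity": 0}
    return sum(cost[op] for op, _ in arch) - 0.5 * sum(h + w for _, (h, w) in arch)

def random_search(trials=100, seed=0):
    """Return the best-scoring sampled architecture."""
    rng = random.Random(seed)
    return max((sample_architecture(rng=rng) for _ in range(trials)),
               key=proxy_score)
```

A real NAS method would replace `proxy_score` with trained-model accuracy and random sampling with a guided search strategy; the sketch only shows the sample-evaluate-select loop.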

Search to Pass Messages for Temporal Knowledge Graph Completion

This work proposes to use neural architecture search (NAS) to design a data-specific message-passing architecture for temporal knowledge graph (TKG) completion, and develops a generalized framework to explore topological and temporal information in TKGs.

Trans-UTPA: PSO and MADDPG based multi-UAVs trajectory planning algorithm for emergency communication

A Transformer model is introduced so that Trans-UTPA's policy learning has no action-space limitation and can run multiple tasks in parallel, which improves the efficiency and generalization of sample processing.

AutoSTR: Efficient Backbone Search for Scene Text Recognition

This work designs a domain-specific search space for STR, which contains both choices of operations and constraints on the downsampling path, and proposes an efficient two-step search algorithm that decouples the search over operations from the search over downsampling paths.
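The "constraints on the downsampling path" idea can be sketched by enumerating per-stage stride sequences whose cumulative downsampling hits a target ratio. The stride choices and the target (8x in height, 4x in width) are assumptions for the example, not AutoSTR's exact configuration:

```python
from itertools import product

# Enumerate valid downsampling paths for a text-recognition backbone.
# CHOICES and the default target ratio are illustrative assumptions.
CHOICES = [(1, 1), (2, 1), (2, 2)]  # (height, width) stride per stage

def valid_paths(num_stages=4, target=(8, 4)):
    """Return all stride sequences whose cumulative downsampling == target."""
    paths = []
    for path in product(CHOICES, repeat=num_stages):
        h = w = 1
        for sh, sw in path:
            h *= sh
            w *= sw
        if (h, w) == target:
            paths.append(path)
    return paths
```

Constraining the path this way shrinks the search space dramatically: only 12 of the 81 four-stage stride sequences above satisfy the 8x4 target.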

An Efficient End-to-End Neural Model for Handwritten Text Recognition

A novel approach combines a deep convolutional network with a recurrent encoder-decoder network to map an image to the sequence of characters corresponding to the text in the image, making it efficient in both computation and memory.

Decoupled Attention Network for Text Recognition

A decoupled attention network (DAN), which decouples the alignment operation from using historical decoding results, and achieves state-of-the-art performance on multiple text recognition tasks, including offline handwritten text recognition and regular/irregular scene text recognition.
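The decoupling idea can be sketched numerically: attention maps are predicted from visual features alone (here supplied as precomputed alignment logits), so the alignment never conditions on previously decoded characters. Shapes and the softmax-then-average form are illustrative assumptions, not DAN's exact layers:

```python
import numpy as np

# Decoupled attention sketch: alignment logits come from the visual
# side only, independent of decoding history. Illustrative shapes.
def decoupled_attention(features, align_logits):
    """features: (T, D) visual features; align_logits: (steps, T).

    Returns per-step context vectors of shape (steps, D)."""
    scores = align_logits - align_logits.max(axis=1, keepdims=True)
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)  # softmax over T positions
    return attn @ features                   # weighted sum of features
```

In a history-coupled attention decoder, `align_logits` would instead depend on the previous decoder state, which is exactly the dependence DAN removes.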

TextScanner: Reading Characters in Order for Robust Scene Text Recognition

TextScanner bears three characteristics: it belongs to the semantic segmentation family, it generates pixel-wise, multi-channel segmentation maps for character class, position, and order, and it adopts an RNN for context modeling.
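The order-aware reading idea can be sketched as follows: given a per-pixel character-class map and one order map per reading step, each step's character is the majority class inside that step's activated region. This is a simplified toy illustration, not TextScanner's actual formulation:

```python
import numpy as np

# Toy order-aware reading: class votes within each order-map region.
# Thresholding and majority voting are illustrative simplifications.
def read_in_order(class_map, order_maps):
    """class_map: (H, W) int class ids; order_maps: (K, H, W) activations."""
    chars = []
    for om in order_maps:
        mask = om > 0.5  # pixels assigned to this reading step
        if mask.any():
            ids, counts = np.unique(class_map[mask], return_counts=True)
            chars.append(int(ids[np.argmax(counts)]))
    return chars
```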

RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition

The proposed method, dubbed RobustScanner, decodes individual characters with a dynamic ratio between context and positional clues, relying more on positional clues when the decoded sequence offers scarce context, and is thus robust and practical.
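The dynamic ratio can be sketched as a gated blend of two glimpse vectors. The scalar sigmoid gate below is an illustrative assumption in the spirit of RobustScanner's dynamic fusion, not the paper's exact layer:

```python
import numpy as np

# Gated fusion of context-based and position-based glimpses.
# The scalar sigmoid gate is an illustrative simplification.
def dynamic_fusion(context, positional, gate_logit):
    """Blend two glimpse vectors; gate -> 0 favors positional clues."""
    gate = 1.0 / (1.0 + np.exp(-gate_logit))  # sigmoid in (0, 1)
    return gate * context + (1.0 - gate) * positional
```

Driving `gate_logit` strongly negative (scarce context) makes the output approach the positional glimpse, which is the robustness mechanism the summary describes.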

Sequence-To-Sequence Domain Adaptation Network for Robust Text Image Recognition

Extensive text recognition experiments show that SSDAN can efficiently transfer sequence knowledge, validating the promise of the proposed model for real-world applications across recognition scenarios, including natural scene text, handwritten text, and even mathematical expression recognition.

End-to-end scene text recognition

While scene text recognition has generally been treated with highly domain-specific methods, the results demonstrate the suitability of applying generic computer vision methods.

AON: Towards Arbitrarily-Oriented Text Recognition

The arbitrary orientation network (AON) is developed to directly capture deep features of irregular texts, which are fed into an attention-based decoder to generate character sequences; it is comparable to major existing methods on regular datasets.

Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition

This work proposes an easy-to-implement strong baseline for irregular scene text recognition, using off-the-shelf neural network components and only word-level annotations, and achieves state-of-the-art performance on both regular and irregular scene text recognition benchmarks.

Synthetically Supervised Feature Learning for Scene Text Recognition

This work designs a multi-task network with an encoder-discriminator-generator architecture to guide the features of the original image toward those of the clean image, significantly outperforming state-of-the-art methods on standard scene text recognition benchmarks in the lexicon-free category.