I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection

@article{Ye2022I3CLIA,
  title={I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection},
  author={Jian Ye and Jing Zhang and Juhua Liu and Bo Du and Dacheng Tao},
  journal={Int. J. Comput. Vis.},
  year={2022},
  volume={130},
  pages={1961--1977}
}
Existing methods for arbitrary-shaped text detection in natural scenes face two critical issues, i.e., 1) fracture detections at the gaps in a text instance; and 2) inaccurate detections of arbitrary-shaped text instances with diverse background context. To address these issues, we propose a novel method named Intra- and Inter-Instance Collaborative Learning (I3CL). Specifically, to address the first issue, we design an effective convolutional module with multiple receptive fields, which is…

DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer

A concise dynamic point scene text detection Transformer network, termed DPText-DETR, which directly uses point coordinates as queries and dynamically updates them between decoder layers. An Enhanced Factorized Self-Attention module is designed to explicitly model the circular shape of polygon point sequences beyond non-local attention.

Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition

This paper makes the first attempt to perform textual reasoning based on visual semantics. GTR is placed in parallel with the language model in a segmentation-based STR baseline, which can effectively exploit the visual-linguistic complementarity via mutual learning.

References

A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning

This paper proposes a novel segmentation-based text detector, namely SAST, which employs a context attended multi-task learning framework based on a Fully Convolutional Network to learn various geometric properties for the reconstruction of polygonal representation of text regions.

Learning Shape-Aware Embedding for Scene Text Detection

This work treats text detection as instance segmentation and proposes a segmentation-based framework, which extracts each text instance as an independent connected component and maps pixels onto an embedding space to distinguish different text instances.

ASTS: A Unified Framework for Arbitrary Shape Text Spotting

An end-to-end trainable unified framework to perceive and understand text based on different levels of semantics, holistic-, pixel- and sequence-level semantics, and then unify the recognized semantics for robust text spotting.

IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection

A novel end-to-end scene text detector IncepText is proposed from an instance-aware segmentation perspective and deformable PSROI pooling is introduced to deal with multi-oriented text detection.

CRNet: A Center-aware Representation for Detecting Text of Arbitrary Shapes

This work proposes CRNet, an anchor-free scene text detector that leverages a Center-aware Representation to accurately detect arbitrary-shaped scene text. It also proposes a center-aware location algorithm that explicitly learns the center regions and center points of text instances and is able to separate adjacent text instances effectively.

Deep Multi-Scale Context Aware Feature Aggregation for Curved Scene Text Detection

A novel architecture for localizing text regions, which can deal with curved scene text and exploits a box-aware context-based text segmentation module and a box refinement network to obtain the location of scene text.

TextField: Learning a Deep Direction Field for Irregular Scene Text Detection

A novel text detector named TextField, which outperforms the state-of-the-art methods by a large margin on two curved text datasets, Total-Text and SCUT-CTW1500, and also achieves very competitive performance on the multi-oriented datasets ICDAR 2015 and MSRA-TD500.

Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting

This paper proposes an end-to-end trainable text spotting approach named Text Perceptron, which unites text detection and the following recognition part into a whole framework, and helps the whole network achieve global optimization.

Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network

This paper proposes an efficient and accurate arbitrary-shaped text detector, termed Pixel Aggregation Network (PAN), which is equipped with a low computational-cost segmentation head and a learnable post-processing.
...