1st Place Solution to ICDAR 2021 RRC-ICTEXT End-to-end Text Spotting and Aesthetic Assessment on Integrated Circuit
@article{Wang20211stPS, title={1st Place Solution to ICDAR 2021 RRC-ICTEXT End-to-end Text Spotting and Aesthetic Assessment on Integrated Circuit}, author={Qiyao Wang and Pengfei Li and Li Zhu and Yi Niu}, journal={ArXiv}, year={2021}, volume={abs/2104.03544} }
This paper presents our proposed methods for the ICDAR 2021 Robust Reading Challenge on Integrated Circuit Text Spotting and Aesthetic Assessment (ICDAR RRC-ICTEXT 2021). For the text spotting task, we detect the characters on integrated circuits and classify them using the YOLOv5 detection model. We balance the lowercase and non-lowercase classes by using SynthText, generated data, and a data sampler. We adopt a semi-supervised algorithm and distillation to further improve the model’s accuracy. For the aesthetic…
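The abstract does not say how the data sampler works. As a rough illustration only, one common way to balance rare and frequent classes is to draw samples with weights inversely proportional to class frequency. The function names `balanced_weights` and `sample_balanced` and the class labels below are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def balanced_weights(labels):
    """Return one sampling weight per item, inversely proportional to
    how often that item's class appears (assumed scheme, not the paper's)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def sample_balanced(items, labels, k, seed=0):
    """Draw k items with replacement so that every class has the same
    total probability of being picked."""
    rng = random.Random(seed)
    return rng.choices(items, weights=balanced_weights(labels), k=k)


# Hypothetical toy data: one lowercase character crop and three others.
items = ["img_a", "img_b", "img_c", "img_d"]
labels = ["lowercase", "other", "other", "other"]
weights = balanced_weights(labels)
batch = sample_balanced(items, labels, k=8)
```

With these weights, the single lowercase sample is as likely to be drawn as the three other samples combined.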
One Citation
ICDAR 2021 Competition on Integrated Circuit Text Spotting and Aesthetic Assessment
- Computer Science, ICDAR
- 2021
ICText, a dataset of text on chips, is the main target of the proposed Robust Reading Challenge on Integrated Circuit Text Spotting and Aesthetic Assessment (RRC-ICText) 2021, which aims to encourage research on this problem.
References
IoU-aware Single-stage Object Detector for Accurate Localization
- Computer Science, Image Vis. Comput.
- 2020
YOLOv4: Optimal Speed and Accuracy of Object Detection
- Computer Science, ArXiv
- 2020
This work uses new features: WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss, and combines some of them to achieve state-of-the-art results: 43.5% AP on the MS COCO dataset at a real-time speed of ~65 FPS on a Tesla V100.
Editing Text in the Wild
- Computer Science, ACM Multimedia
- 2019
This work proposes an end-to-end trainable style retention network (SRNet) that consists of three modules: text conversion module, background inpainting module and fusion module, which is the first attempt to edit text in natural images at the word level.
LVIS: A Dataset for Large Vocabulary Instance Segmentation
- Computer Science, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
This work introduces LVIS (pronounced ‘el-vis’): a new dataset for Large Vocabulary Instance Segmentation, which has a long tail of categories with few training samples due to the Zipfian distribution of categories in natural images.
Weighted Boxes Fusion: ensembling boxes for object detection models
- Computer Science, ArXiv
- 2019
A novel Weighted Boxes Fusion (WBF) ensembling algorithm that boosts performance by combining the predictions of different object detection models, shown on predictions of different models trained on the large Open Images Dataset.
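To make the idea of fusing boxes concrete, here is a minimal sketch of the coordinate-averaging step only, assuming the boxes from different models have already been grouped into one cluster of overlapping detections. The IoU-based clustering and the rescaling of confidence by the number of models are omitted, and the helper name `fuse_cluster` is hypothetical.

```python
def fuse_cluster(boxes, scores):
    """Fuse one cluster of overlapping boxes into a single box.

    boxes  : list of (x1, y1, x2, y2) tuples from different models
    scores : list of confidence scores, one per box

    Each fused coordinate is the confidence-weighted average of the
    input coordinates; the fused score is the mean confidence.
    """
    total = sum(scores)
    fused_box = tuple(
        sum(score * box[i] for box, score in zip(boxes, scores)) / total
        for i in range(4)
    )
    fused_score = total / len(scores)
    return fused_box, fused_score


# Two hypothetical detections of the same character from two models.
box, score = fuse_cluster(
    boxes=[(0.0, 0.0, 10.0, 10.0), (2.0, 2.0, 12.0, 12.0)],
    scores=[0.75, 0.25],
)
```

Unlike non-maximum suppression, which keeps one box and discards the rest, this averaging lets every box in the cluster contribute in proportion to its confidence.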
Synthetic Data for Text Localisation in Natural Images
- Computer Science, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2016
The relation of FCRN to the recently introduced YOLO detector, as well as to other end-to-end object detection systems based on deep learning, is discussed.
Distilling the Knowledge in a Neural Network
- Computer Science, ArXiv
- 2015
This work shows that it can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model and introduces a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.
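As a hedged illustration of what distilling a teacher into a student can look like, the sketch below computes a soft-target loss: the cross-entropy between temperature-softened teacher and student class probabilities, scaled by the squared temperature. The function names, the example logits, and the temperature value are assumptions for illustration, and this is not the training code of any paper listed here.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    The result is multiplied by temperature**2 so that its gradient
    magnitude stays comparable when the temperature changes.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    cross_entropy = -sum(
        t * math.log(s) for t, s in zip(p_teacher, p_student)
    )
    return cross_entropy * temperature ** 2


# Hypothetical logits over three character classes.
teacher = [2.0, 0.0, 0.0]
loss_agree = distillation_loss([2.0, 0.0, 0.0], teacher)
loss_disagree = distillation_loss([0.0, 2.0, 0.0], teacher)
```

A student whose logits match the teacher's gets a lower loss than one that favors a different class, which is the signal the student is trained on.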
ICDAR RRC-ICText