Page Layout Analysis System for Unconstrained Historic Documents

@inproceedings{Kodym2021PageLA,
  title={Page Layout Analysis System for Unconstrained Historic Documents},
  author={O. Kodym and Michal Hradi{\v{s}}},
  booktitle={ICDAR},
  year={2021}
}
Extraction of text regions and individual text lines from historic documents is necessary for automatic transcription. We propose extending a CNN-based text baseline detection system by adding line height and text block boundary predictions to the model output, allowing the system to extract more comprehensive layout information. We also show that pixel-wise text orientation prediction can be used for processing documents with multiple text orientations. We demonstrate that the proposed method…

AT-ST: Self-Training Adaptation Strategy for OCR in Domains with Limited Transcriptions
TLDR
This paper addresses text recognition for domains with limited manual annotations using a simple self-training strategy, and proposes training a seed system on large-scale data from related domains mixed with available annotated data from the target domain.
TS-Net: OCR Trained to Switch Between Text Transcription Styles
TLDR
The proposed Transcription Style Block (TSB) is an adaptive instance normalization conditioned by identifiers representing consistently transcribed documents which can learn from data to switch between multiple transcription styles without any explicit knowledge of transcription rules.

References

Multi-Task Handwritten Document Layout Analysis
TLDR
A system based on artificial neural networks which not only determines the baselines of text lines present in the document, but also performs geometric and logical layout analysis of the document.
Joint Layout Analysis, Character Detection and Recognition for Historical Document Digitization
TLDR
An end-to-end trainable framework for restoring historical document content that follows the correct reading order, together with a re-score mechanism to minimize recognition error, is proposed.
Complete System for Text Line Extraction Using Convolutional Neural Networks and Watershed Transform
TLDR
A novel Convolutional Neural Network based method for the extraction of text lines, which consists of an initial Layout Analysis followed by the estimation of the Main Body Area for each text line, outperforming existing learning-based text line extraction methods.
Dense prediction for text line segmentation in handwritten document images
  • Q. Vo, Gueesang Lee
  • Computer Science
  • 2016 IEEE International Conference on Image Processing (ICIP)
  • 2016
TLDR
A fully convolutional network (FCN) is trained to predict text line structure in document images, and a line adjacency graph (LAG) is used to separate the touching characters into different text strings.
Textline detection in degraded historical document images
TLDR
This paper improves binarization performance by detecting non-text regions and processing only text regions, and improves the textline detection method by extracting the main text block and compensating for skew angle and writing style.
Labeling, Cutting, Grouping: An Efficient Text Line Segmentation Method for Medieval Manuscripts
TLDR
This work proposes a novel method which uses semantic segmentation at pixel level as an intermediate task, followed by a text-line extraction step, and demonstrates that semantic pixel segmentation can be used as a strong denoising pre-processing step before performing text line extraction.
Text Line Segmentation in Historical Document Images Using an Adaptive U-Net Architecture
TLDR
A novel deep learning-based method for text line segmentation of historical documents based on an adaptive U-Net architecture is presented.
Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks
TLDR
An end-to-end, multimodal, fully convolutional network for extracting semantic structures from document images using a unified model that classifies pixels based not only on their visual appearance, as in the traditional page segmentation task, but also on the content of underlying text.
docExtractor: An off-the-shelf historical document element extraction
  • Tom Monnier, Mathieu Aubry
  • Computer Science
  • 2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR)
  • 2020
TLDR
It is argued that the performance obtained without fine-tuning on a specific dataset is critical for applications, in particular in digital humanities, and that the line-level page segmentation is the most relevant for a general purpose element extraction engine.
Fast and Lightweight Text Line Detection on Historical Documents
TLDR
A novel method for handwriting text line detection, which provides a balance between accuracy and computational efficiency, is suitable for mobile and embedded devices, and is significantly faster than other existing methods with comparable accuracy.