READ-BAD: A New Dataset and Evaluation Scheme for Baseline Detection in Archival Documents

@article{Grning2017READBADAN,
  title={READ-BAD: A New Dataset and Evaluation Scheme for Baseline Detection in Archival Documents},
  author={Tobias Gr{\"u}ning and Roger Labahn and Markus Diem and Florian Kleber and Stefan Fiel},
  journal={2018 13th IAPR International Workshop on Document Analysis Systems (DAS)},
  year={2017},
  pages={351--356}
}
  • Tobias Grüning, Roger Labahn, Markus Diem, Florian Kleber, Stefan Fiel
  • Published in
    13th IAPR International…
    2017
  • Computer Science
  • Text line detection is crucial for any application associated with Automatic Text Recognition or Keyword Spotting. Modern algorithms perform well on well-established datasets, since these datasets comprise either clean data or simple, homogeneous page layouts. We have collected and annotated 2036 archival document images from different locations and time periods. The dataset contains varying page layouts and degradations that challenge text line segmentation methods. Well established text line segmentation…


    Citations

    Publications citing this paper.

    A Computationally Efficient Pipeline Approach to Full Page Offline Handwritten Text Recognition

    CITES BACKGROUND

    Arabic Handwritten Documents Segmentation into Text-Lines and Words using Deep Learning

    CITES METHODS

    End-to-End Measure for Text Recognition

    CITES BACKGROUND & METHODS

    Quality-Aware Human-Machine Text Extraction for Biocollections using Ensembles of OCRs

    CITES BACKGROUND

    Robust Keypoint Detection

    CITES METHODS

    cBAD: ICDAR2019 Competition on Baseline Detection

    CITES BACKGROUND

    Baseline Detection in Historical Documents Using Convolutional U-Nets

    CITES BACKGROUND

    Binarization Free Layout Analysis for Arabic Historical Documents Using Fully Convolutional Networks

    CITES BACKGROUND
