Table Recognition and Understanding from PDF Files

@inproceedings{Hassan2007TableRA,
  title={Table Recognition and Understanding from PDF Files},
  author={Tamir Hassan and Robert Baumgartner},
  booktitle={Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)},
  year={2007},
  volume={2},
  pages={1143--1147}
}
We propose a flexible method for detecting and understanding tables in PDF files, which is not reliant upon one particular feature being present, for example ruling lines or indentations, and is therefore applicable to a wide variety of visual presentations. We describe the steps required in transforming the low-level PDF instructions into text segments, lines and boxes on a page. We propose three different classifications for published tables, and develop methods to detect these tables and…
This paper has 56 citations.



Semantic Scholar estimates that this publication has 56 citations based on the available data.
