PDFFigures 2.0: Mining figures from research papers

@inproceedings{Clark2016PDFFigures2M,
  title={PDFFigures 2.0: Mining figures from research papers},
  author={Christopher Clark and Santosh Kumar Divvala},
  booktitle={2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)},
  year={2016},
  pages={143--152}
}
  • Christopher Clark, Santosh Kumar Divvala
  • Published 2016
  • Computer Science
  • 2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)
  • Figures and tables are key sources of information in many scholarly documents. However, current academic search engines do not make use of figures and tables when semantically parsing documents or presenting document summaries to users. To facilitate these applications we develop an algorithm that extracts figures, tables, and captions from documents called “PDFFigures 2.0.” Our proposed approach analyzes the structure of individual pages by detecting captions, graphical elements, and chunks of…

    Citations

    Publications citing this paper.
    SHOWING 1-10 OF 41 CITATIONS

    Extracting Figures and Captions from Scientific Publications

    CITES BACKGROUND & METHODS

    Figure and caption extraction from biomedical documents

    CITES METHODS & RESULTS
    HIGHLY INFLUENCED

    Convolutional Neural Networks for Figure Extraction in Historical Technical Documents

    CITES METHODS

    Extracting Scientific Figures with Distantly Supervised Neural Networks

    CITES BACKGROUND & METHODS
    HIGHLY INFLUENCED

    DocFigure: A Dataset for Scientific Document Figure Classification

    Data-Driven Recognition and Extraction of PDF Document Elements

    CITES METHODS, RESULTS & BACKGROUND
    HIGHLY INFLUENCED

    CITATION STATISTICS

    • 8 Highly Influenced Citations

    • Averaged 10 Citations per year from 2018 through 2020