Extracting Scientific Figures with Distantly Supervised Neural Networks

@inproceedings{Siegel2018ExtractingSF,
  title={Extracting Scientific Figures with Distantly Supervised Neural Networks},
  author={N. Siegel and Nicholas Lourie and R. Power and Waleed Ammar},
  booktitle={Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries},
  year={2018}
}
  • N. Siegel, Nicholas Lourie, R. Power, Waleed Ammar
  • Published 2018
  • Computer Science
  • Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries
  • Non-textual components such as charts, diagrams and tables provide key information in many scientific documents, but the lack of large labeled datasets has impeded the development of data-driven methods for scientific figure extraction. [...] We share the resulting dataset of over 5.5 million induced labels (4,000 times larger than the previous largest figure extraction dataset), with an average precision of 96.8%, to enable the development of modern data-driven methods for this task.
    Construction of the Literature Graph in Semantic Scholar
    • 85
    • Open Access
    TableBank: Table Benchmark for Image-based Table Detection and Recognition
    • 16
    • Highly Influenced
    • Open Access
    DS4A: Deep Search System for Algorithms from Full-Text Scholarly Big Data
    • 3
    Image-based table recognition: data, model, and evaluation
    • 3
    • Open Access
    IIIT-AR-13K: A New Dataset for Graphical Object Detection in Documents
    • 2
    • Highly Influenced
    • Open Access
    Data-Driven Recognition and Extraction of PDF Document Elements
    • 1
    • Highly Influenced
    • Open Access

    References

    Publications referenced by this paper.
    Deep Residual Learning for Image Recognition
    • 49,669
    • Highly Influential
    • Open Access
    PDFFigures 2.0: Mining figures from research papers
    • 42
    • Highly Influential
    • Open Access