Extracting Scientific Figures with Distantly Supervised Neural Networks

@article{Siegel2018ExtractingSF,
  title={Extracting Scientific Figures with Distantly Supervised Neural Networks},
  author={Noah Siegel and Nicholas Lourie and Russell Power and Waleed Ammar},
  journal={Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries},
  year={2018}
}
  • Noah Siegel, Nicholas Lourie, Russell Power, Waleed Ammar
  • Published 2018
  • Computer Science
  • Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries
  • Non-textual components such as charts, diagrams and tables provide key information in many scientific documents, but the lack of large labeled datasets has impeded the development of data-driven methods for scientific figure extraction. [...] We share the resulting dataset of over 5.5 million induced labels (4,000 times larger than the previous largest figure extraction dataset), with an average precision of 96.8%, to enable the development of modern data-driven methods for this task.

    Citations

    Publications citing this paper.

    DS4A: Deep Search System for Algorithms from Full-Text Scholarly Big Data

    CITES BACKGROUND

    Data-Driven Recognition and Extraction of PDF Document Elements

    CITES METHODS & RESULTS
    HIGHLY INFLUENCED

    TableBank: Table Benchmark for Image-based Table Detection and Recognition

    CITES METHODS
    HIGHLY INFLUENCED

    Multi-Modal Association based Grouping for Form Structure Extraction

    CITES METHODS

    References

    Publications referenced by this paper.

    PDFFigures 2.0: Mining figures from research papers

    HIGHLY INFLUENTIAL

    Deep Residual Learning for Image Recognition

    HIGHLY INFLUENTIAL