Extracting Scientific Figures with Distantly Supervised Neural Networks

@inproceedings{Siegel2018ExtractingSF,
  title={Extracting Scientific Figures with Distantly Supervised Neural Networks},
  author={Noah Siegel and Nicholas Lourie and Russell Power and Waleed Ammar},
  booktitle={JCDL},
  year={2018}
}
Non-textual components such as charts, diagrams and tables provide key information in many scientific documents, but the lack of large labeled datasets has impeded the development of data-driven methods for scientific figure extraction. In this paper, we induce high-quality training labels for the task of figure extraction in a large number of scientific documents, with no human intervention. To accomplish this we leverage the auxiliary data provided in two large web collections of scientific…

Key Quantitative Results

  • We share the resulting dataset of over 5.5 million induced labels (4,000 times larger than the previous largest figure extraction dataset), with an average precision of 96.8%, to enable the development of modern data-driven methods for this task.
