Extracting Scientific Figures with Distantly Supervised Neural Networks

@inproceedings{Siegel2018ExtractingSF,
  title={Extracting Scientific Figures with Distantly Supervised Neural Networks},
  author={Noah Siegel and Nicholas Lourie and Russell Power and Waleed Ammar},
  booktitle={JCDL},
  year={2018}
}
Non-textual components such as charts, diagrams and tables provide key information in many scientific documents, but the lack of large labeled datasets has impeded the development of data-driven methods for scientific figure extraction. In this paper, we induce high-quality training labels for the task of figure extraction in a large number of scientific documents, with no human intervention. To accomplish this we leverage the auxiliary data provided in two large web collections of scientific…

From This Paper

Key Quantitative Results

  • We share the resulting dataset of over 5.5 million induced labels (4,000 times larger than the previous largest figure extraction dataset), with an average precision of 96.8%, to enable the development of modern data-driven methods for this task.
