FigureSeer: Parsing Result-Figures in Research Papers

@inproceedings{Siegel2016FigureSeerPR,
  title={FigureSeer: Parsing Result-Figures in Research Papers},
  author={Noah Siegel and Zachary Horvitz and Roie Levin and Santosh Kumar Divvala and Ali Farhadi},
  booktitle={ECCV},
  year={2016}
}
‘Which are the pedestrian detectors that yield a precision above 95% at 25% recall?’ [...] The key challenge in analyzing the figure content is the extraction of the plotted data and its association with the legend entries. We address this challenge by formulating a novel graph-based reasoning approach using a CNN-based similarity metric. We present a thorough evaluation on a real-world annotated dataset to demonstrate the efficacy of our approach.

Diag2graph: Representing Deep Learning Diagrams In Research Papers As Knowledge Graphs

TLDR
Diag2Graph, an end-to-end framework for parsing deep learning diagram-figures, is introduced; it represents the parsed diagrams as a deep knowledge graph and enables powerful search and retrieval of architectural details in research papers.

Figure Captioning in Scholarly Literatures to Augment Search Results

TLDR
A novel end-to-end framework for scholarly figure captioning is presented that can effectively generate captions for figures under several metrics and enable a variety of applications, such as figure search engines and figure query answering.

FigureQA: An Annotated Figure Dataset for Visual Reasoning

TLDR
FigureQA is envisioned as a first step towards developing models that can intuitively recognize patterns from visual representations of data, and preliminary results indicate that the task poses a significant machine learning challenge.

Extracting Scientific Figures with Distantly Supervised Neural Networks

TLDR
This paper induces high-quality training labels for the task of figure extraction in a large number of scientific documents, with no human intervention, and uses this dataset to train a deep neural network for end-to-end figure detection, yielding a model that can be more easily extended to new domains compared to previous work.

Scientific Chart Summarization: Datasets and Improved Text Modeling

TLDR
This paper studies the chart summarization problem, in which the goal is to generate sentences that describe the salient information in a chart image, and proposes a model that uses not only a standard visual encoder but also a text encoder to encode a chart image.

Figure Captioning with Relation Maps for Reasoning

TLDR
This work investigates the problem of figure caption generation, where the goal is to automatically generate a natural language description for a given figure, introduces a dataset, FigCAP, and proposes a novel attention mechanism to solve the exposure bias issue.

Line Chart Understanding with Convolutional Neural Network

TLDR
A problem definition for explicitly understanding knowledge in a line chart is proposed, and an algorithm for generating supervised data that are easy to share and scale up is provided, offering a separate and scalable environment to enhance research into technical document understanding.

SciCap: Generating Captions for Scientific Figures

TLDR
This paper introduces SciCap, a large-scale figure-caption dataset based on computer science arXiv papers published between 2010 and 2020, and proposes an end-to-end neural framework to automatically generate informative, high-quality captions for scientific figures.

FigExplorer: A System for Retrieval and Exploration of Figures from Collections of Research Articles

TLDR
The FigExplorer system was designed to facilitate the collection of user data for training and test purposes and it is flexible enough such that it can be extended to include new functions and algorithms.
...

References


Automatic Extraction of Figures from Scholarly Documents

TLDR
The challenges of how to build a heuristic-independent trainable model for such an extraction task and how to extract figures at scale are discussed, and three new evaluation metrics are defined: figure-precision, figure-recall, and figure-F1-score.

An Architecture for Information Extraction from Figures in Digital Libraries

TLDR
A modular architecture for analyzing multiple figures representing experimental findings, an extractor algorithm to extract vector graphics from scholarly documents, and a classification algorithm for figures which is very scalable, yet achieves 85% accuracy, are proposed.

A figure search engine architecture for a chemistry digital library

TLDR
This work gives the framework for the extraction algorithm, architecture and ranking function, and indexes figure captions and mentions extracted from the PDF documents using a custom-built extractor.

The Pascal Visual Object Classes Challenge: A Retrospective

TLDR
A review of the Pascal Visual Object Classes challenge from 2008-2012 and an appraisal of the aspects of the challenge that worked well, and those that could be improved in future challenges.

Mind's eye: A recurrent visual representation for image caption generation

TLDR
This paper explores the bi-directional mapping between images and their sentence-based descriptions with a recurrent neural network that attempts to dynamically build a visual representation of the scene as a caption is being generated or read.

ImageNet: A large-scale hierarchical image database

TLDR
A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

Improving state-of-the-art OCR through high-precision document-specific modeling

TLDR
This work uses the state-of-the-art OCR system Tesseract to produce an initial translation, and uses this initial translation to bootstrap document-specific character models, which are able to reduce the error over properly segmented characters by 34.1% overall.

Towards retrieving relevant information graphics

TLDR
This paper presents the first steps toward a system for retrieving bar charts and line graphs that reasons about the content of the graphic itself in deciding its relevance to the user query, and achieves accuracy higher than 80% on a corpus of collected user queries.

ReVision: automated classification, analysis and redesign of chart images

TLDR
ReVision is a system that automatically redesigns visualizations to improve graphical perception, and applies perceptually-based design principles to populate an interactive gallery of redesigned charts.

3D Wikipedia: Using online text to automatically label and navigate reconstructed geometry

TLDR
An approach for analyzing Wikipedia and other text, together with online photos, to produce annotated 3D models of famous tourist sites is introduced, which enables a number of new interactions, which are demonstrated in a new 3D visualization tool.
...