FigureSeer: Parsing Result-Figures in Research Papers

Authors: Noah Siegel, Zachary Horvitz, Roie Levin, Santosh Kumar Divvala, Ali Farhadi
Published in: European Conference on Computer Vision
‘Which are the pedestrian detectors that yield a precision above 95% at 25% recall?’ [...] The key challenge in analyzing the figure content is the extraction of the plotted data and its association with the legend entries. We address this challenge by formulating a novel graph-based reasoning approach using a CNN-based similarity metric. We present a thorough evaluation on a real-world annotated dataset to demonstrate the efficacy of our approach.
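As a rough illustration of the kind of query quoted above, the sketch below assumes the plotted data has already been extracted and associated with its legend entries, giving one precision-recall curve per method. The data layout, the function names `precision_at_recall` and `methods_above`, the linear interpolation between points, and the sample numbers are all assumptions made for this example; none of them comes from the paper.

```python
from bisect import bisect_left


def precision_at_recall(curve, recall):
    """Linearly interpolate precision at `recall`.

    `curve` is a list of (recall, precision) points sorted by recall.
    Returns None when `recall` lies outside the range the curve covers.
    """
    if not curve:
        return None
    recalls = [r for r, _ in curve]
    if recall < recalls[0] or recall > recalls[-1]:
        return None
    i = bisect_left(recalls, recall)  # first point with recall >= query
    r1, p1 = curve[i]
    if r1 == recall:
        return p1
    r0, p0 = curve[i - 1]
    return p0 + (p1 - p0) * (recall - r0) / (r1 - r0)


def methods_above(curves, recall, min_precision):
    """Names of methods whose precision at `recall` exceeds `min_precision`."""
    hits = []
    for name, curve in curves.items():
        p = precision_at_recall(sorted(curve), recall)
        if p is not None and p > min_precision:
            hits.append(name)
    return sorted(hits)


# Made-up curves keyed by legend entry (hypothetical values, for illustration only).
curves = {
    "detector_a": [(0.0, 1.0), (0.2, 0.98), (0.4, 0.96), (0.8, 0.6)],
    "detector_b": [(0.0, 1.0), (0.2, 0.95), (0.4, 0.85), (0.8, 0.5)],
    "detector_c": [(0.5, 0.9), (0.9, 0.4)],
}

print(methods_above(curves, recall=0.25, min_precision=0.95))
```

A curve that does not cover the requested recall is skipped rather than extrapolated, since guessing values outside the plotted range would misrepresent the figure.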

Diag2graph: Representing Deep Learning Diagrams In Research Papers As Knowledge Graphs

Diag2Graph, an end-to-end framework for parsing deep learning diagram-figures, is introduced; it represents the parsed diagrams in the form of a deep knowledge graph, enabling powerful search and retrieval of architectural details in research papers.

ACL-Fig: A Dataset for Scientific Figure Classification

A pipeline that extracts figures and tables from the scientific literature and a deep-learning-based framework that classifies scientific figures using visual features are developed, and the first large-scale automatically annotated corpus, ACL-Fig, is built, consisting of 112,052 scientific figures extracted from 56K research papers in the ACL Anthology.

Figure Captioning in Scholarly Literatures to Augment Search Results

A novel end-to-end framework for scholarly figure captioning that can effectively generate captions for figures under several metrics and enable a variety of current exciting applications, such as figure search engine and figure query answering.

FigureQA: An Annotated Figure Dataset for Visual Reasoning

FigureQA is envisioned as a first step towards developing models that can intuitively recognize patterns from visual representations of data, and preliminary results indicate that the task poses a significant machine learning challenge.

Extracting Scientific Figures with Distantly Supervised Neural Networks

This paper induces high-quality training labels for the task of figure extraction in a large number of scientific documents, with no human intervention, and uses this dataset to train a deep neural network for end-to-end figure detection, yielding a model that can be more easily extended to new domains compared to previous work.

Scientific Chart Summarization: Datasets and Improved Text Modeling

This paper studies the chart summarization problem, in which the goal is to generate sentences that describe the salient information in a chart image, and proposes a model that uses not only a standard visual encoder but also a text encoder to encode a chart image.

CTE: A Dataset for Contextualized Table Extraction

The task of Contextualized Table Extraction (CTE) is defined, which aims to extract and define the structure of tables considering the textual context of the document, and the dataset can support CTE and adds new classes to the original ones.

Figure Captioning with Relation Maps for Reasoning

This work investigates the problem of figure caption generation, where the goal is to automatically generate a natural language description for a given figure; it introduces a dataset, FigCAP, and proposes a novel attention mechanism to solve the exposure bias issue.

Line Chart Understanding with Convolutional Neural Network

A problem definition for explicitly understanding the knowledge in a line chart is proposed, along with an algorithm for generating supervised data that is easy to share and scale up, providing a separate and scalable environment to enhance research into technical document understanding.



Automatic Extraction of Figures from Scholarly Documents

The challenges of building a heuristic-independent trainable model for figure extraction and of extracting figures at scale are discussed, and three new evaluation metrics are defined: figure-precision, figure-recall, and figure-F1-score.

An Architecture for Information Extraction from Figures in Digital Libraries

A modular architecture for analyzing multiple figures representing experimental findings, an extractor algorithm to extract vector graphics from scholarly documents, and a highly scalable classification algorithm for figures that achieves 85% accuracy are proposed.

Looking Beyond Text: Extracting Figures, Tables and Captions from Computer Science Papers

This work introduces a new dataset of 150 computer science papers along with ground truth labels for the locations of the figures, tables and captions within them and demonstrates a caption-to-figure matching component that is effective even in cases where individual captions are adjacent to multiple figures.

A figure search engine architecture for a chemistry digital library

This work gives the framework for the extraction algorithm, architecture, and ranking function, and indexes figure captions and mentions extracted from the PDF documents using a custom-built extractor.

The Pascal Visual Object Classes Challenge: A Retrospective

A review of the Pascal Visual Object Classes challenge from 2008-2012 and an appraisal of the aspects of the challenge that worked well, and those that could be improved in future challenges.

Mind's eye: A recurrent visual representation for image caption generation

This paper explores the bi-directional mapping between images and their sentence-based descriptions with a recurrent neural network that attempts to dynamically build a visual representation of the scene as a caption is being generated or read.

ImageNet: A large-scale hierarchical image database

A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

Improving state-of-the-art OCR through high-precision document-specific modeling

This work uses the state-of-the-art OCR system Tesseract to produce an initial translation, and uses this initial translation to bootstrap document-specific character models, which are able to reduce the error over properly segmented characters by 34.1% overall.

Towards retrieving relevant information graphics

This paper presents the first steps toward a system for retrieving bar charts and line graphs that reasons about the content of the graphic itself in deciding its relevance to the user query, and achieves accuracy higher than 80% on a corpus of collected user queries.

ReVision: automated classification, analysis and redesign of chart images

ReVision is a system that automatically redesigns visualizations to improve graphical perception, and applies perceptually-based design principles to populate an interactive gallery of redesigned charts.