Scim: Intelligent Faceted Highlights for Interactive, Multi-Pass Skimming of Scientific Papers

Raymond Fok, Andrew Head, Jonathan Bragg, Kyle Lo, Marti A. Hearst, Daniel S. Weld

Researchers are expected to keep up with an immense literature, yet often find it prohibitively time-consuming to do so. This paper explores how intelligent agents can help scaffold in-situ information seeking across scientific papers. Specifically, we present Scim, an AI-augmented reading interface designed to help researchers skim papers by automatically identifying, classifying, and highlighting salient sentences, organized into rhetorical facets rooted in common information needs. Using…

Threddy: An Interactive System for Personalized Thread-based Exploration and Organization of Scientific Literature

This work develops a tool, integrated into users' reading process, that helps them leverage authors' existing summarization of threads, typically found in introduction or related work sections, in order to situate their own work's contributions.



HiText: Text Reading with Dynamic Salience Marking

This work presents HiText, a simple yet effective way of dynamically marking parts of a document in accordance with their salience, which results in marked increases in user satisfaction and reading efficiency, as assessed using TOEFL-style reading comprehension tests.

Augmenting Scientific Papers with Just-in-Time, Position-Sensitive Definitions of Terms and Symbols

This work introduces ScholarPhi, an augmented reading interface with four novel features: tooltips that surface position-sensitive definitions from elsewhere in a paper, a filter over the paper that “declutters” it to reveal how the term or symbol is used across the paper, automatic equation diagrams that expose multiple definitions in parallel, and an automatically generated glossary of important terms and symbols.

Wikum: Bridging Discussion Forums and Wikis Using Recursive Summarization

This article describes a workflow called recursive summarization, implemented in the Wikum prototype, that enables a large population of readers or editors to work in small doses to refine out the main points of the discussion.

Scientific Information Understanding via Open Educational Resources (OER)

This work proposes a novel learning/reading environment, the OER-based Collaborative PDF Reader (OCPR), which incorporates scaffolding methods that can auto-characterize students' emerging information needs while they read a paper and enable them to readily access open educational resources (OER) based on those needs.

Extracting references between text and charts via crowdsourcing

This work presents a crowdsourcing pipeline in which crowd workers extract the references between text and charts in paragraph-chart pairs, and applies automated clustering and merging techniques to unify the references generated by multiple workers into a single set.

TLDR: Extreme Summarization of Scientific Documents

This work introduces SCITLDR, a new multi-target dataset of 5.4K TLDRs over 3.2K papers, and proposes CATTS, a simple yet effective learning strategy for generating TLDRs that exploits titles as an auxiliary training signal.

ScentHighlights: highlighting conceptually-related sentences during reading

This work describes how skimming can be enhanced by highlighting sentences within electronic text that are conceptually related to search keywords, where the relatedness of conceptual keywords is computed via word co-occurrence and spreading activation.

Elastic Documents: Coupling Text and Tables through Contextual Visualizations for Enhanced Document Reading

This paper parses the text content and data tables, cross-links the components using a keyword-based matching algorithm, and generates on-demand visualizations based on the reader's current focus within a document, coupling the text content with the data tables contained in the document.

Enriching a document collection by integrating information extraction and PDF annotation

This work presents a high-accuracy citation extraction algorithm that significantly improves on earlier reported techniques, and a technique for integrating PDF processing with a conventional text-stream-based information extraction pipeline.

Summarizing Scientific Articles: Experiments with Relevance and Rhetorical Status

This article provides a gold standard for summaries of this kind consisting of a substantial corpus of conference articles in computational linguistics annotated with human judgments of the rhetorical status and relevance of each sentence in the articles.