Augmenting Scientific Papers with Just-in-Time, Position-Sensitive Definitions of Terms and Symbols

Andrew Head, Kyle Lo, Dongyeop Kang, Raymond Fok, Sam Skjonsberg, Daniel S. Weld, Marti A. Hearst
Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
Despite the central importance of research papers to scientific progress, they can be difficult to read. Comprehension is often stymied when the information needed to understand a passage resides somewhere else, in another section or in another paper. In this work, we envision how interfaces can bring definitions of technical terms and symbols to readers when and where they need them most. We introduce ScholarPhi, an augmented reading interface with four novel features: (1) tooltips that…

Document-Level Definition Detection in Scholarly Documents: Existing Models, Error Analyses, and Future Directions
This paper develops a new definition detection system, HEDDEx, that utilizes syntactic features, transformer encoders, and heuristic filters, evaluates it on a standard sentence-level benchmark, and notes that performance on the high-recall document-level task is much lower than in the standard evaluation approach.
PAWLS: PDF Annotation With Labels and Structure
This paper presents PDF Annotation with Labels and Structure (PAWLS), a new annotation tool designed specifically for the PDF document format, particularly suited for mixed-mode annotation and scenarios in which annotators require extended context to annotate accurately.
A system for information extraction from scientific texts in Russian
In this paper, we present a system for information extraction from scientific texts in the Russian language. The system performs several tasks in an end-to-end manner: term recognition, extraction of…


Beyond paper: supporting active reading with free form digital ink annotations
The XLibris “active reading machine” demonstrates that computers can help active readers organize and find information while retaining many of the advantages of reading on paper.
Enriching a document collection by integrating information extraction and PDF annotation
This paper presents a high-accuracy citation extraction algorithm that significantly improves on earlier reported techniques, and a technique for integrating PDF processing with a conventional text-stream-based information extraction pipeline.
Mathematical Language Processing Project
Two approaches to discovering identifier-definition tuples are compared: a simple pattern-matching approach, and an approach that uses part-of-speech-tag-based distances as well as sentence positions to calculate identifier-definition probabilities.
PaperQuest: A Visualization Tool to Support Literature Review
PaperQuest is presented, a visualization tool that supports efficient reading decisions by displaying only the information useful at a given step of the review, in order to find and sort papers that are likely to be relevant to users.
SideNoter: Scholarly Paper Browsing System based on PDF Restructuring and Text Annotation
This system provides ways to extract natural language sentences from PDF files together with their logical structures, and to map arbitrary textual spans to their corresponding regions on page images; it is planned to be made widely available to NLP researchers.
The reader's helper: a personalized document reading environment
A new document reading environment called the Reader's Helper™ is introduced, which supports the reading of electronic and paper documents and produces a relevance score for each of the reader's topics of interest, thereby helping the reader decide whether the document is actually worth skimming or reading.
Schema-laden Purposes and Purpose-laden Schema
Just as a scientist writes as part of an active life within a research community, the scientist reads as part of the continuing activity of research. If texts are not, cannot be, produced by the simple…
Elastic Documents: Coupling Text and Tables through Contextual Visualizations for Enhanced Document Reading
This paper parses the text content and data tables, cross-links the components using a keyword-based matching algorithm, and generates on-demand visualizations, based on the reader's current focus within a document, that couple text content with the data tables contained in the document.
Mining Scientific Terms and their Definitions: A Study of the ACL Anthology
DefMiner is presented, a supervised sequence labeling system that identifies scientific terms and their accompanying definitions and achieves 85% F1 on a Wikipedia benchmark corpus, significantly improving the previous state-of-the-art by 8%.
Hypertext: An Introduction and Survey
A survey of existing hypertext systems, their applications, and their design is both an introduction to the world of hypertext and a survey of some of the most important design issues that go into fashioning a hypertext environment.