Corpus ID: 233481964

Improving the Accessibility of Scientific Documents: Current State, User Needs, and a System Solution to Enhance Scientific PDF Accessibility for Blind and Low Vision Users

Lucy Lu Wang, Isabel Cachola, Jonathan Bragg, Evie (Yu-Yen) Cheng, Chelsea Hess Haupt, Matt Latzke, Bailey Kuehl, Madeleine van Zuylen, Linda M. Wagner, Daniel S. Weld
The majority of scientific papers are distributed in PDF, which poses challenges for accessibility, especially for blind and low vision (BLV) readers. We characterize the scope of this problem by assessing the accessibility of 11,397 PDFs published 2010-2019 and sampled across various fields of study, finding that only 2.4% of these PDFs satisfy all of our defined accessibility criteria. We introduce the SciA11y system to offset some of the issues around inaccessibility. SciA11y incorporates…

Exploring Team-Sourced Hyperlinks to Address Navigation Challenges for Low-Vision Readers of Scientific Papers

Readers of all abilities may be able to organically leave traces in papers, and these traces can be used to facilitate navigation tasks, in particular for low-vision readers.

SciA11y: Converting Scientific Papers to Accessible HTML

SciA11y uses machine learning models to extract and understand the content of scientific PDFs, and reorganizes the resulting paper components into a form that better supports skimming and scanning for blind and low vision readers.

A Dataset of Alt Texts from HCI Publications

It is found that the capacity of author-written alt text to fulfill blind and low vision user needs is mixed; for example, only 50% of alt texts in the authors' sample contain information about extrema or outliers, and only 31% contain information about major trends or comparisons conveyed by the graph.

The Accessibility of Data Visualizations on the Web for Screen Reader Users: Practices and Experiences During COVID-19

Observations during this critical period of time provide an understanding of the widespread accessibility issues encountered across online data visualizations, the impact that data accessibility inequities have on the BVI community, the ways screen reader users sought access to data-driven information and made use of online visualizations to form insights, and the pressing need to make larger strides towards improving data literacy, building confidence, and enriching methods of access.

Author Reflections on Creating Accessible Academic Papers

Academic papers demonstrate inaccessibility despite accessible writing resources made available by SIGACCESS and others. The move from accessibility guidance to accessibility implementation is

Towards Optimizing OCR for Accessibility

Visual cues such as structure, emphasis, and icons play an important role in efficient information foraging by sighted individuals and make for a pleasurable reading experience. Blind, low-vision and

Document Navigability: A Need for Print-Impaired

This paper proposes a vision-based technique to locate the referenced content and extract the metadata needed to inline a content summary into the audio narration. Applied to citations in scientific documents, it works well on both born-digital and scanned documents.

Development and Evaluation of a Tool for Assisting Content Creators in Making PDF Files More Accessible

The approaches taken in Ally improve the ability to create accessible PDFs efficiently and accurately for the four important aspects studied, but future work will need to incorporate additional functionality, related to remediating alt text, forms, and other aspects of PDF accessibility.

VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups

This work introduces new methods that explicitly model VIsual LAyout (VILA) groups, that is, text lines or text blocks, to further improve performance and shows that simply inserting special tokens denoting layout group boundaries into model inputs can lead to a 1.9% Macro F1 improvement in token classification.

Incorporating Visual Layout Structures for Scientific Text Classification

This work introduces new methods for incorporating VIsual LAyout (VILA) structures, e.g., the grouping of page texts into text lines or text blocks, into language models to further improve performance and designs a hierarchical model, H-VILA, that encodes the text based on layout structures.



Extracting Scientific Figures with Distantly Supervised Neural Networks

This paper induces high-quality training labels for the task of figure extraction in a large number of scientific documents, with no human intervention, and uses this dataset to train a deep neural network for end-to-end figure detection, yielding a model that can be more easily extended to new domains compared to previous work.

Making the field of computing more inclusive

More accessible conferences, digital resources, and ACM SIGs will lead to greater participation by more people with disabilities.

S2ORC: The Semantic Scholar Open Research Corpus

In S2ORC, a large corpus of 81.1M English-language academic papers spanning many academic disciplines is introduced, which is expected to facilitate research and development of tools and tasks for text mining over academic text.

Use of Ranks in One-Criterion Variance Analysis

Given C samples, with n_i observations in the ith sample, a test of the hypothesis that the samples are from the same population may be made by ranking the observations from 1 to Σn_i

Web Content Accessibility Guidelines (WCAG) 2.0

Web Content Accessibility Guidelines (WCAG) 2.0 covers a wide range of recommendations for making Web content more accessible to a wider range of people with disabilities, including blindness and low vision, deafness and hearing loss, learning disabilities, limited movement, and more.

An Uninteresting Tour Through Why Our Research Papers Aren't Accessible

The context in which PDFs became their publication format, the difficulty in making PDF documents accessible given current tools, what the authors have tried to make their PDFs more accessible, and potential options for doing better in the future are overviewed.

Creating accessible PDFs for conference proceedings

The accessibility of 1,811 papers in the technical program of several top conferences related to accessibility and human-computer interaction and thoughts on research challenges and future work that may make the community's research more accessible are reported on.

How science should support researchers with visual impairments.

Naheda Sahtout says being legally blind doesn’t fundamentally affect her skills, and argues that science needs to start a conversation to attract and empower more researchers like her.

Twitter A11y: A Browser Extension to Make Twitter Images Accessible

Twitter A11y increases access to social media platforms for people with visual impairments by providing high-quality automatic descriptions for user-posted images, increasing alt-text coverage from 7.6% to 78.5% before crowdsourcing descriptions for the remaining images.

A Formative Study on Designing Accurate and Natural Figure Captioning Systems

This work crawled, annotated, and analyzed a corpus of real-world human-written figure captions, showing that real-world captions usually consist of a finite set of caption units and that automatic figure captioning should be formulated as a multi-stage task.