A Large Dataset of Historical Japanese Documents with Complex Layouts
- Zejiang Shen, Kaixuan Zhang, Melissa Dell
- Computer Science
- IEEE/CVF Conference on Computer Vision and…
- 18 April 2020
This work presents HJDataset, a large-scale dataset of historical Japanese documents with complex layouts that contains over 250,000 layout element annotations of seven types, and demonstrates the usefulness of the dataset on real-world document digitization tasks.
LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis
- Zejiang Shen, Ruochen Zhang, Melissa Dell, B. Lee, Jacob Carlson, Weining Li
- Computer Science
- IEEE International Conference on Document…
- 29 March 2021
The core LayoutParser library comes with a set of simple and intuitive interfaces for applying and customizing DL models for layout detection, character recognition, and many other document processing tasks. It also incorporates a community platform for sharing both pre-trained models and full document digitization pipelines.
Deep Learning based Framework for Automatic Damage Detection in Aircraft Engine Borescope Inspection
- Zejiang Shen, Xili Wan, Feng Ye, Xinjie Guan, S. Liu
- Computer Science
- International Conference on Computing, Networking…
- 1 February 2019
A deep learning based framework is proposed that uses Fully Convolutional Networks (FCN) to identify and locate two major types of damage, namely cracks and burns, in borescope images.
VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups
- Zejiang Shen, Kyle Lo, Lucy Lu Wang, Bailey Kuehl, Daniel S. Weld, Doug Downey
- Computer Science
- Transactions of the Association for Computational Linguistics
- 1 June 2021
This work introduces new methods that explicitly model VIsual LAyout (VILA) groups, that is, text lines or text blocks, to further improve performance. It shows that simply inserting special tokens denoting layout group boundaries into model inputs can lead to a 1.9% Macro F1 improvement in token classification.
Don't Say What You Don't Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search
- Daniel King, Zejiang Shen, Nishant Subramani, Daniel S. Weld, Iz Beltagy, Doug Downey
- Computer Science
- ArXiv
- 16 March 2022
This work presents PINOCCHIO, a new decoding method that improves the consistency of a transformer-based abstractive summarizer by constraining beam search to avoid hallucinations.
Information Extraction from Text Regions with Complex Tabular Structure
- Kaixuan Zhang, Zejiang Shen, Jie Zhou, Melissa Dell
- Computer Science
- 14 September 2019
This paper presents a new dataset with complex tabular structure, and proposes new methods to robustly retrieve information from the complex text region.
Multi-LexSum: Real-World Summaries of Civil Rights Lawsuits at Multiple Granularities
- Zejiang Shen, Kyle Lo, L. Yu, N. Dahlberg, Margo Schlanger, Doug Downey
- Computer Science
- ArXiv
- 22 June 2022
This work introduces Multi-LexSum, a collection of 9,280 expert-authored summaries drawn from ongoing Civil Rights Litigation Clearinghouse (CRLC) writing, and demonstrates that, despite the high-quality summaries in the training data, state-of-the-art summarization models perform poorly on this task.
PAWLS: PDF Annotation With Labels and Structure
- Mark Neumann, Zejiang Shen, Sam Skjonsberg
- Computer Science
- Annual Meeting of the Association for…
- 25 January 2021
This paper presents PDF Annotation with Labels and Structure (PAWLS), a new annotation tool designed specifically for the PDF document format, particularly suited for mixed-mode annotation and scenarios in which annotators require extended context to annotate accurately.
Incorporating Visual Layout Structures for Scientific Text Classification
- Zejiang Shen, Kyle Lo, Lucy Lu Wang, Bailey Kuehl, Daniel S. Weld, Doug Downey
- Computer Science
- ArXiv
- 2021
This work introduces new methods for incorporating VIsual LAyout (VILA) structures, e.g., the grouping of page texts into text lines or text blocks, into language models to further improve performance. It also designs a hierarchical model, H-VILA, that encodes the text based on layout structures.
OLALA: Object-Level Active Learning for Efficient Document Layout Annotation
- Zejiang Shen, Jian Zhao, Melissa Dell, Yaoliang Yu, Weining Li
- Computer Science
- 5 October 2020
This work proposes OLALA, an Object-Level Active Learning framework for efficient document layout Annotation, in which only the regions with the most ambiguous object predictions within an image are selected for annotators to label, optimizing the use of the annotation budget.