LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis
- Zejiang Shen, Ruochen Zhang, Melissa Dell, B. Lee, Jacob Carlson, Weining Li
- Computer Science
- ICDAR
- 29 March 2021
The core LayoutParser library provides a set of simple and intuitive interfaces for applying and customizing DL models for layout detection, character recognition, and many other document processing tasks, and incorporates a community platform for sharing both pre-trained models and full document digitization pipelines.
A Large Dataset of Historical Japanese Documents with Complex Layouts
- Zejiang Shen, Kaixuan Zhang, Melissa Dell
- Computer Science
- IEEE/CVF Conference on Computer Vision and…
- 18 April 2020
This work presents HJDataset, a large-scale dataset of historical Japanese documents with complex layouts that contains over 250,000 layout element annotations of seven types, and demonstrates the usefulness of the dataset on real-world document digitization tasks.
Information Extraction from Text Regions with Complex Tabular Structure
This paper presents a new dataset with complex tabular structure, and proposes new methods to robustly retrieve information from the complex text region.
Deep Learning based Framework for Automatic Damage Detection in Aircraft Engine Borescope Inspection
- Zejiang Shen, Xili Wan, Feng Ye, Xinjie Guan, S. Liu
- Computer Science
- International Conference on Computing, Networking…
- 1 February 2019
A deep learning based framework is proposed that uses Fully Convolutional Networks (FCN) to identify and locate damage in borescope images, covering two major types of damage: cracks and burns.
OLALA: Object-Level Active Learning Based Layout Annotation
- Zejiang Shen, Jian Zhao, Melissa Dell, Yaoliang Yu, Weining Li
- Computer Science, Mathematics
- ArXiv
- 5 October 2020
This work introduces an Object-Level Active Learning based Layout Annotation framework, OLALA, which combines an object scoring method with a prediction correction algorithm so that only the most ambiguous object prediction regions within an image are selected for annotators to label, optimizing the use of the annotation budget.
Incorporating Visual Layout Structures for Scientific Text Classification
- Zejiang Shen, Kyle Lo, Lucy Lu Wang, Bailey Kuehl, Daniel S. Weld, Doug Downey
- Computer Science
- ArXiv
This work introduces new methods for incorporating VIsual LAyout (VILA) structures, e.g., the grouping of page texts into text lines or text blocks, into language models to further improve performance, and designs a hierarchical model, H-VILA, that encodes the text based on layout structures.
OLALA: Object-Level Active Learning for Efficient Document Layout Annotation
This work presents OLALA, an Object-Level Active Learning framework for efficient document layout Annotation, in which only regions with the most ambiguous object predictions within an image are selected for annotators to label, optimizing the use of the annotation budget.
PAWLS: PDF Annotation With Labels and Structure
This paper presents PDF Annotation with Labels and Structure (PAWLS), a new annotation tool designed specifically for the PDF document format, particularly suited for mixed-mode annotation and scenarios in which annotators require extended context to annotate accurately.
Improving Unpaired Object Translation for Unaligned Domains
Generative Adversarial Networks have shown promise in unpaired image translation. However, translating unpaired objects from unaligned domains is an unsolved problem. Existing methods are restricted…
Generating Object Stamps
- Youssef A. Mejjati, Zejiang Shen, +4 authors K. Kim
- Computer Science, Engineering
- ArXiv
- 1 January 2020
An algorithm to generate diverse foreground objects and composite them into background images using a GAN architecture that allows for improved overall quality and diversity compared to state-of-the-art object insertion approaches.