• Corpus ID: 222133860

OLALA: Object-Level Active Learning Based Layout Annotation

Zejiang Shen, Jian Zhao, Melissa Dell, Yaoliang Yu, Weining Li
In layout object detection problems, ground-truth datasets are constructed by annotating object instances individually. Yet active learning for object detection is typically conducted at the image level, not at the object level. Because objects appear with different frequencies across images, image-level active learning can over-expose annotators to common objects, which reduces the efficiency of human labeling. This work introduces an Object-Level Active Learning based Layout…
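The contrast between image-level and object-level selection can be illustrated with a minimal sketch. It is not the paper's method: the toy predictions, the `1 - confidence` uncertainty score, and the helper names `select_images` and `select_objects` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical detector output: (image_id, category, confidence) per predicted object.
# "text" is deliberately the common category; "figure" and "table" are rare.
predictions = [
    ("page_1", "text", 0.98),
    ("page_1", "text", 0.97),
    ("page_1", "text", 0.95),
    ("page_1", "figure", 0.40),
    ("page_2", "text", 0.96),
    ("page_2", "text", 0.94),
    ("page_2", "table", 0.45),
    ("page_3", "text", 0.99),
    ("page_3", "text", 0.93),
]


def uncertainty(confidence):
    """Assumed score: a low-confidence prediction counts as uncertain."""
    return 1.0 - confidence


def select_images(preds, n_images):
    """Image-level selection: rank images by mean object uncertainty,
    then send every object in the chosen images to the annotator."""
    by_image = {}
    for pred in preds:
        by_image.setdefault(pred[0], []).append(pred)
    ranked = sorted(
        by_image.values(),
        key=lambda objs: sum(uncertainty(c) for _, _, c in objs) / len(objs),
        reverse=True,
    )
    return [obj for objs in ranked[:n_images] for obj in objs]


def select_objects(preds, n_objects):
    """Object-level selection: send only the most uncertain objects."""
    return sorted(preds, key=lambda p: uncertainty(p[2]), reverse=True)[:n_objects]


image_level = select_images(predictions, n_images=2)
object_level = select_objects(predictions, n_objects=2)

print(Counter(category for _, category, _ in image_level))
print([category for _, category, _ in object_level])
```

On this toy data, image-level selection sends seven objects to the annotator, five of which are confident "text" regions, while object-level selection sends only the uncertain figure and table. How the unselected objects are then handled is outside this sketch.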


LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis

The core LayoutParser library comes with a set of simple and intuitive interfaces for applying and customizing DL models for layout detection, character recognition, and many other document processing tasks and incorporates a community platform for sharing both pre-trained models and full document digitization pipelines.



Deep active learning for object detection

A novel active learning method is developed that casts the layered architecture used in object detection as a ‘query by committee’ paradigm to choose the set of images to be queried; these methods outperform classical uncertainty-based active learning algorithms such as maximum entropy.

Consistency-based Semi-supervised Learning for Object detection

Proposes a Consistency-based Semi-supervised learning method for object Detection (CSD), which uses consistency constraints to enhance detection performance by making full use of available unlabeled data.

Microsoft COCO: Common Objects in Context

We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene

Active Learning for Deep Detection Neural Networks

Proposes a new image-level scoring process to rank unlabeled images for automatic selection, which clearly outperforms classical scores and can be applied to videos and sets of still images.

Towards Human-Machine Cooperation: Self-Supervised Sample Mining for Object Detection

This paper presents a principled Self-supervised Sample Mining (SSM) process accounting for the real challenges in object detection, and proposes a new AL framework for gradually incorporating unlabeled or partially labeled data into the model learning while minimizing users' annotation effort.

Interactive object detection

An interactive object annotation method that incrementally trains an object detector while the user provides annotations, gives live feedback to the user by detecting objects on the fly, and predicts the potential annotation costs of unseen images.

Active Learning for Deep Object Detection

A novel method of active learning for object detection with an incremental learning scheme to enable continuous exploration of new unlabeled datasets and a set of uncertainty-based active learning metrics suitable for most object detectors are proposed.

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.

A Large Dataset of Historical Japanese Documents with Complex Layouts

This work presents HJDataset, a large-scale dataset of historical Japanese documents with complex layouts that contains over 250,000 layout element annotations of seven types, and demonstrates the usefulness of the dataset on real-world document digitization tasks.

READ-BAD: A New Dataset and Evaluation Scheme for Baseline Detection in Archival Documents

This paper collects and annotates 2036 archival document images from different locations and time periods and proposes a new baseline-based evaluation scheme that needs no binarization and can handle skewed as well as rotated text lines.