• Publications
An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition
A novel neural network architecture is proposed that integrates feature extraction, sequence modeling, and transcription into a unified framework, yielding an effective yet much smaller model that is more practical for real-world application scenarios.
DOTA: A Large-Scale Dataset for Object Detection in Aerial Images
A large-scale Dataset for Object deTection in Aerial images (DOTA) is introduced, and state-of-the-art object detection algorithms are evaluated on it, demonstrating that DOTA represents real Earth Vision applications well and is quite challenging.
AID: A Benchmark Data Set for Performance Evaluation of Aerial Scene Classification
The Aerial Image data set (AID), a large-scale data set for aerial scene classification, is described to advance the state of the art in scene classification of remote sensing images, and its results can serve as baselines on this benchmark.
Detecting texts of arbitrary orientations in natural images
A system is proposed that detects text of arbitrary orientations in natural images using a two-level classification scheme and two sets of features specially designed to capture the intrinsic characteristics of text; the work also aims to better evaluate the algorithm and compare it with competing algorithms.
ASTER: An Attentional Scene Text Recognizer with Flexible Rectification
This work introduces ASTER, an end-to-end neural network model that comprises a rectification network and a recognition network that predicts a character sequence directly from the rectified image.
Multiple Instance Detection Network with Online Instance Classifier Refinement
This work formulates weakly supervised object detection as a Multiple Instance Learning (MIL) problem, in which instance classifiers (object detectors) are placed in the network as hidden nodes, and instance labels inferred from weak supervision are propagated to spatially overlapping instances to refine the instance classifiers online.
TextBoxes: A Fast Text Detector with a Single Deep Neural Network
This paper presents an end-to-end trainable fast scene text detector, named TextBoxes, which detects scene text with both high accuracy and efficiency in a single network forward pass, involving no
Robust Scene Text Recognition with Automatic Rectification
RARE (Robust text recognizer with Automatic REctification) is a recognition model that is robust to irregular text and end-to-end trainable, requiring only images and associated text labels, which makes it convenient to train and deploy in practical systems.
Auto-Context and Its Application to High-Level Vision Tasks and 3D Brain Image Segmentation
  • Z. Tu, X. Bai
  • Computer Science
    IEEE Transactions on Pattern Analysis and Machine…
  • 1 October 2010
The scope of the proposed algorithm goes beyond image analysis: it has the potential to be used for a wide variety of structured prediction problems, including high-level vision and medical image segmentation.
Richer Convolutional Features for Edge Detection
The proposed network fully exploits multiscale and multilevel information of objects to perform the image-to-image prediction by combining all the meaningful convolutional features in a holistic manner and achieves state-of-the-art performance on several available datasets.