Class-Specific Ensembles for Active Learning in Digital Imagery

Abstract

In many real-world image classification tasks, only limited amounts of labeled data are available to train automatic classifiers. Consequently, extensive human expert involvement is required for instance labeling. Detecting Egeria densa in digital imagery is one such task. It presents an additional challenge: subtle spectral changes in Egeria make it difficult to find a single accurate classifier. A novel solution is proposed that employs an ensemble of classifiers for each class (class-specific ensembles), combined with an active learning scheme. The class-specific ensembles are implicitly diverse, and diversity is required to increase overall accuracy when predictions are combined. The combined predictions of the ensembles can be used to reduce the uncertainty in detecting Egeria. Iterative active learning is then suggested to adapt the ensembles to new images not yet seen by the active learner. A novel method for building compact ensembles is also presented; compactness is needed to expedite the retraining of the active learner. The combined result is a set of accurate, compact ensembles that require significantly less expert involvement for image region classification.

1 Introduction

Multimedia content is rapidly becoming a major target for data mining research. This paper is concerned with image mining, that is, discovering patterns and knowledge from images for the purpose of classifying images or matching similar images. The specific problem we address is image region classification. Egeria densa is an exotic submerged aquatic weed that causes navigation and reservoir-pumping problems in the Sacramento-San Joaquin Delta of Northern California. As part of a control program to manage Egeria, classification of regions in aerial images is required. This problem can be abstracted to one of classifying massive data without class labels.
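The scheme outlined in the abstract (one ensemble per class, combined predictions, and iterative querying of an expert) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the nearest-centroid members trained on bootstrap samples, the vote-margin uncertainty measure, the two-class setup, the toy feature vectors, and all function names. It does not cover the compact-ensemble construction.

```python
import random

def train_member(samples, target, rng):
    """Hypothetical ensemble member: a one-vs-rest nearest-centroid rule
    fitted on a bootstrap sample of the labeled data."""
    boot = [rng.choice(samples) for _ in samples]
    pos = [x for x, y in boot if y == target]
    neg = [x for x, y in boot if y != target]
    # Fall back to the full labeled set if the bootstrap missed one side.
    if not pos:
        pos = [x for x, y in samples if y == target]
    if not neg:
        neg = [x for x, y in samples if y != target]
    mean = lambda rows: [sum(col) / len(rows) for col in zip(*rows)]
    return mean(pos), mean(neg)

def member_votes_yes(member, x):
    """A member votes for its class when x is closer to the class centroid."""
    pos_c, neg_c = member
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return dist(x, pos_c) < dist(x, neg_c)

def train_ensembles(samples, classes, size, rng):
    """Build one ensemble of `size` members for every class."""
    return {c: [train_member(samples, c, rng) for _ in range(size)]
            for c in classes}

def class_scores(ensembles, x):
    """Score of a class = fraction of its own ensemble voting for it."""
    return {c: sum(member_votes_yes(m, x) for m in members) / len(members)
            for c, members in ensembles.items()}

def predict(ensembles, x):
    scores = class_scores(ensembles, x)
    return max(scores, key=scores.get)

def uncertainty(ensembles, x):
    """Small margin between the two best class scores means high uncertainty.
    Assumes at least two classes."""
    ranked = sorted(class_scores(ensembles, x).values(), reverse=True)
    return 1.0 - (ranked[0] - ranked[1])

def active_learn(labeled, pool, oracle, classes, rounds, size=7, seed=0):
    """Repeatedly ask the expert (oracle) about the most uncertain region,
    then retrain all class-specific ensembles on the enlarged labeled set."""
    rng = random.Random(seed)
    labeled, pool = list(labeled), list(pool)
    ensembles = train_ensembles(labeled, classes, size, rng)
    for _ in range(rounds):
        if not pool:
            break
        query = max(pool, key=lambda x: uncertainty(ensembles, x))
        pool.remove(query)
        labeled.append((query, oracle(query)))
        ensembles = train_ensembles(labeled, classes, size, rng)
    return ensembles, labeled

# Toy data: two made-up features per image region; the oracle stands in
# for the human expert.
classes = ["egeria", "other"]
seed_labels = [((0.1, 0.2), "egeria"), ((0.9, 0.8), "other")]
pool = [(0.45, 0.5), (0.55, 0.6), (0.3, 0.2), (0.7, 0.9)]
oracle = lambda x: "egeria" if x[0] + x[1] < 1.0 else "other"

ensembles, labeled = active_learn(seed_labels, pool, oracle, classes, rounds=3)
```

In this sketch the expert is consulted only for the regions on which the ensembles disagree most, which is where the reduction in labeling effort would come from; retraining cost grows with ensemble size, which is the motivation for keeping ensembles compact.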
Relying on human experts for class labeling is not only time-consuming and costly, but also unreliable if the experts are overburdened with minute and routine tasks. Massive manual classifica
