Active Learning for Networked Data


We introduce a novel active learning algorithm for classification of networked data. In this setting, training instances are connected by links to form a network, the labels of linked nodes are correlated, and the goal is to exploit these dependencies to label the nodes accurately. This problem arises in many domains, including social and biological network analysis and document classification, and there has been much recent interest in methods that collectively classify the nodes in a network. Labeled examples are often expensive to obtain, whereas network information is frequently available. We show how an active learning algorithm can take advantage of network structure. Our algorithm exploits the links between instances and the interaction between the local and collective aspects of a classifier to improve accuracy when learning from fewer labeled examples. We experiment with two real-world benchmark collective classification domains and show that the algorithm achieves high accuracy even when only a small fraction of the data is labeled.
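The abstract does not spell out the selection rule, so the following is only an illustrative sketch of how disagreement between a local prediction and a collective prediction could guide which node to label next. It is not the authors' algorithm. The function names `collective_predict` and `pick_query`, the toy graph, the hard-coded local predictions, and the one-round majority-vote rule are all assumptions made for this example.

```python
from collections import Counter

def collective_predict(graph, local_pred, known):
    """One round of neighbor majority voting (an assumed, simplified
    stand-in for a collective classifier). Known labels are clamped."""
    current = {**local_pred, **known}
    out = {}
    for node, nbrs in graph.items():
        if node in known:
            out[node] = known[node]
            continue
        votes = Counter(current[n] for n in nbrs)
        votes[local_pred[node]] += 1  # the node's own local prediction is one vote
        # Ties are broken deterministically in favor of the smallest label.
        out[node] = max(sorted(votes), key=lambda lbl: votes[lbl])
    return out

def pick_query(graph, local_pred, known):
    """Pick the unlabeled node where local and collective predictions
    disagree, preferring higher-degree nodes (hypothetical heuristic)."""
    collective = collective_predict(graph, local_pred, known)
    candidates = sorted(n for n in graph if n not in known)

    def score(n):
        disagree = int(local_pred[n] != collective[n])
        return (disagree, len(graph[n]))

    return max(candidates, key=score)

# Toy network: adjacency lists, assumed local predictions, one known label.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
local_pred = {"a": "x", "b": "x", "c": "y", "d": "y", "e": "y"}
known = {"a": "x"}

print(pick_query(graph, local_pred, known))  # c
```

In this toy network, node `c` is predicted `y` from its own attributes but `x` once its neighbors vote, so it is the node the sketch would ask an annotator to label. A real system would replace the hard-coded `local_pred` with a trained classifier and the single voting round with an iterative collective inference procedure.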

Cite this paper

@inproceedings{Bilgic2010ActiveLF, title={Active Learning for Networked Data}, author={Mustafa Bilgic and Lilyana Mihalkova and Lise Getoor}, booktitle={ICML}, year={2010} }