On Proper Unit Selection in Active Learning: Co-Selection Effects for Named Entity Recognition


Active learning is an effective method for creating training sets cheaply, but it is a biased sampling process and fails to explore large regions of the instance space in many applications. This can result in a missed cluster effect, which significantly lowers recall and slows down learning for infrequent classes. We show that missed clusters can be avoided…


6 Figures and Tables