There are several ways in which the algorithmic acquisition of language knowledge and behavior can be studied. One important area of research is the computational modeling of human language acquisition using statistical, machine learning, or neural network methods. See Broeder and Murre (2000) for a recent collection of this type of research. And there is ...
We introduce a memory-based approach to part of speech tagging. Memory-based learning is a form of supervised learning based on similarity-based reasoning. The part of speech tag of a word in a particular context is extrapolated from the most similar cases held in memory. Supervised learning approaches are useful when a tagged corpus is available as an ...
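The idea of extrapolating a tag from the most similar stored cases can be illustrated with a small nearest-neighbor sketch. This is not the authors' system: the feature set (previous word, focus word, next word, two-letter suffix), the plain overlap similarity, and the names `featurize`, `train`, and `tag` are all assumptions chosen for illustration.

```python
from collections import Counter

def overlap(a, b):
    """Similarity: number of feature positions on which two cases agree."""
    return sum(1 for x, y in zip(a, b) if x == y)

def featurize(words, i):
    """Hypothetical context features for the word at position i."""
    prev_word = words[i - 1] if i > 0 else "<s>"
    next_word = words[i + 1] if i + 1 < len(words) else "</s>"
    return (prev_word, words[i], next_word, words[i][-2:])

def train(tagged_sentences):
    """'Training' only stores every case in memory; nothing is abstracted."""
    memory = []
    for sent in tagged_sentences:
        words = [w for w, _ in sent]
        for i, (_, pos) in enumerate(sent):
            memory.append((featurize(words, i), pos))
    return memory

def tag(memory, words, k=1):
    """Tag each word with the majority tag of its k most similar stored cases."""
    tags = []
    for i in range(len(words)):
        feats = featurize(words, i)
        ranked = sorted(memory, key=lambda m: overlap(feats, m[0]), reverse=True)
        votes = Counter(pos for _, pos in ranked[:k])
        tags.append(votes.most_common(1)[0][0])
    return tags

corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
memory = train(corpus)
print(tag(memory, ["the", "cat", "barks"]))
```

Sorting the whole memory for every word is linear in the number of stored cases, which is acceptable for a toy corpus but is the cost that indexing schemes for memory-based learners are meant to reduce.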
GAMBL is a word expert approach to WSD in which each word expert is trained using memory-based learning. Joint feature selection and algorithm parameter optimization are achieved with a genetic algorithm (GA). We use a cascaded classifier approach in which the GA optimizes local context features and the output of a separate keyword classifier (rather than ...
We show that in language learning, contrary to received wisdom, keeping exceptional training instances in memory can be beneficial for generalization accuracy. We investigate this phenomenon empirically on a selection of benchmark natural language processing tasks: grapheme-to-phoneme conversion, part-of-speech tagging, prepositional-phrase attachment, and ...
We examine how differences in language models, learned by different data-driven systems performing the same NLP task, can be exploited to yield a higher accuracy than the best individual system. We do this by means of experiments involving the task of morphosyntactic word class tagging, on the basis of three different tagged corpora. Four well-known tagger ...
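One simple way to exploit disagreement between taggers is per-token majority voting. The truncated abstract does not say which combination methods the authors evaluated, so the sketch below is only an assumed baseline; the function name `combine_by_vote` and the `fallback` tie-breaking rule (defer to one designated tagger) are hypothetical.

```python
from collections import Counter

def combine_by_vote(predictions, fallback=0):
    """Combine tag sequences from several taggers by per-token majority vote.

    predictions: list of tag sequences, one per tagger, all of equal length.
    fallback: index of the tagger whose tag is used when the vote is tied
              (an assumed rule, e.g. the best individual system).
    """
    combined = []
    for token_tags in zip(*predictions):
        counts = Counter(token_tags).most_common()
        top_tag, top_count = counts[0]
        tied = [t for t, c in counts if c == top_count]
        combined.append(top_tag if len(tied) == 1 else token_tags[fallback])
    return combined

outputs = [
    ["DET", "NOUN", "VERB"],   # tagger 0
    ["DET", "VERB", "VERB"],   # tagger 1
    ["DET", "NOUN", "NOUN"],   # tagger 2
    ["PRON", "ADJ", "VERB"],   # tagger 3
]
print(combine_by_vote(outputs))
```

Voting can only beat the best single system when the taggers make at least partly different errors; if they all fail on the same tokens, the majority is wrong as well.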
We describe TADPOLE, a modular memory-based morphosyntactic tagger and dependency parser for Dutch. Though primarily aimed at being accurate, the design of the system is also driven by optimizing speed and memory usage, using a trie-based approximation of k-nearest neighbor classification as the basis of each module. We perform an evaluation of its three ...
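A trie can approximate nearest-neighbor lookup by storing training cases as paths of feature values and answering a query from the deepest node it can reach. The sketch below is a generic illustration under stated assumptions, not TADPOLE's implementation: features are assumed to be ordered from most to least informative, and `Node`, `build_trie`, and `classify` are hypothetical names.

```python
from collections import Counter

class Node:
    """Trie node: children keyed by feature value, plus class counts."""
    def __init__(self):
        self.children = {}
        self.counts = Counter()

def build_trie(instances):
    """Insert each (features, label) case as a path; count labels along it."""
    root = Node()
    for feats, label in instances:
        node = root
        node.counts[label] += 1
        for value in feats:
            node = node.children.setdefault(value, Node())
            node.counts[label] += 1
    return root

def classify(root, feats):
    """Follow matching feature values as far as possible, then return the
    most frequent label seen at the last matching node."""
    node = root
    for value in feats:
        child = node.children.get(value)
        if child is None:
            break
        node = child
    return node.counts.most_common(1)[0][0]

# Toy cases: (focus word, previous word) -> tag; feature order is assumed.
cases = [
    (("dog", "the"), "NOUN"),
    (("dog", "to"), "VERB"),
    (("dog", "a"), "NOUN"),
    (("run", "to"), "VERB"),
]
trie = build_trie(cases)
print(classify(trie, ("dog", "to")), classify(trie, ("dog", "my")))
```

Lookup cost depends on the number of features rather than the number of stored cases, which is the speed and memory trade-off that motivates this kind of approximation; the price is that a mismatch on an early feature hides any agreement on later ones.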