In this paper, we evaluate a variety of knowledge sources and supervised learning algorithms for word sense disambiguation on SENSEVAL-2 and SENSEVAL-1 data. Our knowledge sources include the part-of-speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. The learning algorithms evaluated include(More)
Part-of-speech (POS) tag distributions are known to exhibit sparsity: a word is likely to take a single predominant tag in a corpus. Recent research has demonstrated that incorporating this sparsity constraint improves tagging accuracy. However, in existing systems, this expansion comes with a steep increase in model complexity. This paper proposes a simple(More)
We participated in the SENSEVAL-3 English lexical sample task and multilingual lexical sample task. We adopted a supervised learning approach with Support Vector Machines, using only the official training data provided. No other external resources were used. The knowledge sources used were part-of-speech of neighboring words, single words in the surrounding(More)
Conceptual graphs are very useful for representing structured knowledge. However, existing formulations of fuzzy conceptual graphs are not suitable for matching images of natural scenes. This paper presents a new variation of fuzzy conceptual graphs that is more suited to image matching. This variant differentiates between a model graph that describes a(More)
In this paper, we present a learning approach to the scenario template task of information extraction, where information filling one template could come from multiple sentences. When tested on the MUC-4 task, our learning approach achieves accuracy competitive with the best of the MUC-4 systems, which were all built with manually engineered rules. Our(More)
A method for automatic analysis of time-oriented clinical narratives would be of significant practical import for medical decision making, data modeling and biomedical research. This paper proposes a robust corpus-based approach for temporal analysis of medical discharge summaries. We characterize temporal organization of clinical narratives in terms of(More)
The connection between part-of-speech (POS) categories and morphological properties is well-documented in linguistics but underutilized in text processing systems. This paper proposes a novel model for morphological segmentation that is driven by this connection. Our model learns that words with common affixes are likely to be in the same syntactic(More)
If unsupervised morphological analyzers could approach the effectiveness of supervised ones, they would be a very attractive choice for improving MT performance on low-resource inflected languages. In this paper, we compare performance gains for state-of-the-art supervised vs. unsupervised morphological analyzers, using a state-of-the-art Arabic-to-English(More)