Yoong Keok Lee

Learn More
In this paper, we evaluate a variety of knowledge sources and supervised learning algorithms for word sense disambiguation on SENSEVAL-2 and SENSEVAL-1 data. Our knowledge sources include the part-of-speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. The learning algorithms evaluated include(More)
In this paper, we present a framework and a system that extracts events relevant to a query from a collection C of documents, and places such events along a timeline. Each event is represented by a sentence extracted from C, based on the assumption that "important" events are widely cited in many documents for a period of time within which these events are(More)
We participated in the SENSEVAL-3 English lexical sample task and multilingual lexical sample task. We adopted a supervised learning approach with Support Vector Machines, using only the official training data provided. No other external resources were used. The knowledge sources used were partof-speech of neighboring words, single words in the surrounding(More)
Conceptual graphs are very useful for representing structured knowledge. However, existing formulations of fuzzy conceptual graphs are not suitable for matching images of natural scenes. This paper presents a new variation of fuzzy conceptual graphs that is more suited to image matching. This variant differentiates between a model graph that describes a(More)
The connection between part-of-speech (POS) categories and morphological properties is well-documented in linguistics but underutilized in text processing systems. This paper proposes a novel model for morphological segmentation that is driven by this connection. Our model learns that words with common affixes are likely to be in the same syntactic category(More)
In this paper, we present a learning approach to the scenario template task of information extraction, where information filling one template could come from multiple sentences. When tested on the MUC4 task, our learning approach achieves accuracy competitive to the best of the MUC-4 systems, which were all built with manually engineered rules. Our analysis(More)
A method for automatic analysis of time-oriented clinical narratives would be of significant practical import for medical decision making, data modeling and biomedical research. This paper proposes a robust corpus-based approach for temporal analysis of medical discharge summaries. We characterize temporal organization of clinical narratives in terms of(More)
If unsupervised morphological analyzers could approach the effectiveness of supervised ones, they would be a very attractive choice for improving MT performance on low-resource inflected languages. In this paper, we compare performance gains for state-of-the-art supervised vs. unsupervised morphological analyzers, using a state-of-theart Arabic-to-English(More)