Altaf Rahman

Learn More
We investigate new methods for creating and applying ensembles for coreference resolution. While existing ensembles for coreference resolution are typically created using different learning algorithms, clustering algorithms or training sets, we harness recent advances in coreference modeling and propose to create our ensemble from a variety of supervised(More)
Research in named entity recognition and mention detection has typically involved a fairly small number of semantic classes, which may not be adequate if semantic class information is intended to support natural language applications. Motivated by this observation, we examine the under-studied problem of semantic subtype induction, where the goal is to(More)
To build a coreference resolver for a new language, the typical approach is to first coreference-annotate documents from this target language and then train a resolver on these annotated documents using supervised learning techniques. However, the high cost associated with manually coreference-annotating documents needed by a supervised approach makes it(More)
Traditional learning-based coreference resolvers operate by training the mention-pair model for determining whether two mentions are coreferent or not. Though conceptually simple and easy to understand, the mention-pair model is linguistically rather unappealing and lags far behind the heuristic-based coreference models proposed in the pre-statistical NLP(More)
We examine the task of resolving complex cases of definite pronouns, specifically those for which traditional linguistic constraints on coreference (e.g., Binding Constraints, gender and number agreement) as well as commonly-used resolution heuristics (e.g., string-matching facilities, syntactic salience) are not useful. Being able to solve this task has(More)
Named entity recognition for morphologically rich, case-insensitive languages, including the majority of semitic languages, Iranian languages, and Indian languages, is inherently more difficult than its English counterpart. Worse still, progress on machine learning approaches to named entity recognition for many of these languages is currently hampered by(More)
While information status (IS) plays a crucial role in discourse processing, there have only been a handful of attempts to automatically determine the IS of discourse entities. We examine a related but more challenging task, fine-grained IS determination, which involves classifying a discourse entity as one of 16 IS subtypes. We investigate the use of rich(More)