Complex Predicates in Indian Language Wordnets

Abstract

Wordnets, which are repositories of lexical semantic knowledge containing semantically linked synsets and lexically linked words, are indispensable for work on computational linguistics and natural language processing. While building wordnets for Hindi and Marathi, two major Indo-European languages, we observed that the verb hierarchy in the Princeton WordNet was rather shallow. We set out to construct a verb knowledge base for Hindi that arranges Hindi verbs in a hierarchy based on the is-a (hypernymy) relation. We realized that there are phenomena unique to Indian languages that bear upon the choice between lexicalizing an expression and deriving it syntactically. One such example is the occurrence of conjunct and compound verbs (called Complex Predicates), which are found in all Indian languages. This paper presents our experience in the construction of lexical knowledge bases for Indian languages, with special attention to Hindi. The question of whether to store or derive complex predicates is addressed both linguistically and computationally. We have constructed empirical tests to decide whether a combination of two words, the second of which is a verb, is a complex predicate. Such tests provide a principled way of deciding the status of complex predicates in Indian language wordnets. An additional application of this work is the possibility of automatically augmenting wordnets using corpora, a topic of great interest in current research.

Cite this paper

@inproceedings{Bhattacharyya2007ComplexPI,
  title={Complex Predicates in Indian Language Wordnets},
  author={Pushpak Bhattacharyya and Debasri Chakrabarti and Vaijayanthi M. Sarma},
  year={2007}
}