General-to-Specific Model Selection for Subcategorization Preference

Abstract

This paper proposes a novel method for learning probability models of the subcategorization preference of verbs. We treat the two issues of case dependencies and noun class generalization in a uniform way by employing the maximum entropy modeling method. We also propose a new model selection algorithm which starts from the most general model and gradually examines more specific models. The experimental evaluation shows that both the case dependencies and the specific sense restrictions selected by the proposed method contribute to improving performance in subcategorization preference resolution.

1 Introduction

In empirical approaches to parsing, lexical/semantic collocations extracted from corpora have proved quite useful for ranking parses in syntactic analysis. For example, Magerman (1995), Collins (1996), and Charniak (1997) proposed statistical parsing models which incorporate lexical/semantic information. In their models, syntactic and lexical/semantic features depend on each other and are combined together. This paper also proposes a method of utilizing lexical/semantic features for ranking parses in syntactic analysis. Unlike the models of Magerman (1995), Collins (1996), and Charniak (1997), however, we assume that syntactic and lexical/semantic features are independent, and we focus on extracting the lexical/semantic collocational knowledge of verbs that is useful in syntactic analysis. More specifically, we propose a novel method for learning a probability model of the subcategorization preference of verbs.

In general, when learning the lexical/semantic collocational knowledge of verbs from a corpus, it is necessary to consider two issues: 1) case dependencies, and 2) noun class generalization. For 1), we have to decide which cases depend on each other and which cases are optional and independent of the other cases. For 2), we have to decide which superordinate class generates each observed leaf class in a verb-noun collocation.

* This research was partially supported by the Ministry of Education, Science, Sports and Culture, Japan, Grant-in-Aid for Encouragement of Young Scientists, 09780338, 1998. An extended version of this paper is available from the above URL.

Several previous works have addressed these two issues in learning the collocational knowledge of verbs and have evaluated the results in terms of syntactic disambiguation. Resnik (1993) and Li and Abe (1995) studied how to find an optimal abstraction level of an argument noun in a tree-structured thesaurus, but their work is limited to a single argument. Li and Abe (1996) also studied a method for learning dependencies between case slots, and reported that dependencies were discovered only at the slot level and not at the class level.

Compared with these previous works, this paper proposes to treat the above two issues in a uniform way. First, we introduce a model of generating a collocation of a verb and argument/adjunct nouns (section 2) and then view it as a probability model (section 3). As the model learning method, we adopt maximum entropy model learning (Della Pietra et al., 1997; Berger et al., 1996). Case dependencies and noun class generalization are represented as features in the maximum entropy approach. Features are allowed to overlap, which is quite advantageous when we consider case dependencies and noun class generalization in parameter estimation.
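To illustrate how overlapping features can encode both issues, the following is a minimal, self-contained sketch, not the authors' implementation: the class inventory, the slot names, and the feature set are invented for illustration, and the weights are fit by plain gradient ascent on the log-likelihood rather than the improved iterative scaling of Della Pietra et al. (1997).

    import math
    from itertools import product

    # Hypothetical toy event space: a collocation of one verb with a noun
    # class in each of two case slots ('ga' and 'wo').
    CLASSES = ["animal", "bird", "artifact", "vehicle"]
    EVENTS = list(product(CLASSES, CLASSES))

    def make_features():
        # Overlapping binary features: one per slot/class pair (noun class
        # generalization, slots treated independently), plus one feature on
        # a specific class *combination* (a case dependency).
        feats = []
        for c in CLASSES:
            feats.append(lambda ev, c=c: ev[0] == c)  # 'ga' slot alone
            feats.append(lambda ev, c=c: ev[1] == c)  # 'wo' slot alone
        feats.append(lambda ev: ev == ("bird", "vehicle"))  # dependent pair
        return feats

    def model_probs(feats, lam):
        # p(ev) is proportional to exp(sum of weights of active features).
        scores = [math.exp(sum(l for f, l in zip(feats, lam) if f(ev)))
                  for ev in EVENTS]
        z = sum(scores)
        return [s / z for s in scores]

    def fit(feats, data, iters=1000, lr=0.5):
        # Gradient ascent on log-likelihood: the gradient for each weight
        # is E_empirical[f] - E_model[f].
        lam = [0.0] * len(feats)
        emp = [sum(f(ev) for ev in data) / len(data) for f in feats]
        for _ in range(iters):
            p = model_probs(feats, lam)
            exp = [sum(pi for pi, ev in zip(p, EVENTS) if f(ev))
                   for f in feats]
            lam = [l + lr * (e - m) for l, e, m in zip(lam, emp, exp)]
        return lam

    feats = make_features()
    data = [("bird", "vehicle")] * 3 + [("animal", "artifact")] * 2
    lam = fit(feats, data)
    p = model_probs(feats, lam)
    print(p[EVENTS.index(("bird", "vehicle"))])  # boosted by pair feature

Because the pair feature overlaps with the two single-slot features, the estimator automatically apportions probability mass between the independent-slot explanation and the dependent-pair explanation, which is exactly the advantage claimed above.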
An optimal model is selected by searching for an optimal set of features, i.e., optimal case dependencies and optimal noun class generalization levels. As the feature selection process, this paper proposes a new feature selection algorithm which starts from the most general model and gradually examines more specific models (section 4). As the model evaluation criterion during this general-to-specific search, we employ the description length of the model and guide the search so as to minimize the description length (Rissanen, 1984). Then, after obtaining a sequence of subcategorization preference models totally ordered from general to specific, we select an approximately optimal subcategorization preference model according to the accuracy of a subcategorization preference test. In the experimental evaluation of the performance of subcategorization […]
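To make the general-to-specific search concrete, the following is a minimal sketch of MDL-guided specialization, in the spirit of the tree cut models of Li and Abe (1995) rather than the authors' full algorithm: it covers only the noun class generalization dimension over an invented toy thesaurus, scores each candidate model by description length, i.e. (k/2) log N for k free parameters plus the negative log-likelihood of the data, and replaces a class by its children only when doing so reduces the description length.

    import math

    # Hypothetical toy thesaurus: class -> children (leaves have none).
    TREE = {"root": ["animal", "artifact"],
            "animal": ["dog", "cat"],
            "artifact": ["car", "bike"]}

    def leaves(c):
        kids = TREE.get(c, [])
        return [c] if not kids else [l for k in kids for l in leaves(k)]

    def description_length(cut, data):
        # DL(model) + DL(data | model): (k/2) log N for the k = |cut| - 1
        # free parameters, plus the negative log-likelihood under the cut,
        # with probability uniform over the leaves below each class.
        n = len(data)
        param_dl = (len(cut) - 1) / 2 * math.log2(n)
        data_dl = 0.0
        for c in cut:
            count = sum(1 for w in data if w in leaves(c))
            if count == 0:
                continue
            p_leaf = (count / n) / len(leaves(c))
            data_dl -= count * math.log2(p_leaf)
        return param_dl + data_dl

    def general_to_specific(data):
        cut = ["root"]  # the most general model
        while True:
            cands = [cut[:i] + TREE[c] + cut[i + 1:]
                     for i, c in enumerate(cut) if c in TREE]
            scored = [(description_length(c, data), c) for c in cands]
            best = min(scored, default=None)
            if best is None or best[0] >= description_length(cut, data):
                return cut  # no specialization reduces description length
            cut = best[1]

    data = ["dog"] * 5 + ["cat"] * 4 + ["car"] * 1
    print(general_to_specific(data))  # e.g. ['animal', 'artifact']

In the full procedure described in the paper, the same style of search also covers the case dependency features, and the resulting general-to-specific sequence of models is then evaluated by the accuracy of the subcategorization preference test to pick the approximately optimal one.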
