Past work on generating referring expressions mainly utilized attributes of objects and binary relations between objects. However, such an approach does not work well when there is no distinctive attribute among objects. To overcome this limitation, this paper proposes a method utilizing the perceptual groups of objects and n-ary relations among them. The…
This paper reports a prototype multilingual query expansion system relying on LMF-compliant lexical resources. The system is one of the deliverables of a three-year project aiming at establishing an international standard for language resources that is applicable to Asian languages. Our important contributions to ISO 24613, the standard Lexical Markup Framework…
Linguistic knowledge plays a crucial role in natural language processing. Constructing large linguistic knowledge bases requires considerable human effort and cost. There have been many attempts to construct linguistic knowledge automatically, based on two primary strategies: knowledge extraction from annotated corpora and the augmentation of existing…
Parsing is one of the important processes in natural language processing and, in general, a large-scale CFG is used to parse a wide variety of sentences. For many languages, a CFG is derived from a large-scale syntactically annotated corpus, and many parsing algorithms using CFGs have been proposed. However, we could not apply them to Japanese since a…
Automatic query expansion has been known to be the most important method in overcoming the word mismatch problem in information retrieval. Thesauri have long been used by many researchers as a tool for query expansion. However, only one type of thesaurus has generally been used. In this paper we analyze the characteristics of different thesaurus types and…
This paper proposes an efficient example sampling method for example-based word sense disambiguation systems. To construct a database of practical size, a considerable overhead for manual sense disambiguation (overhead for supervision) is required. In addition, the time complexity of searching a large-sized database poses a considerable problem (overhead…
© The author(s) of this report reserve all rights. Abstract: We propose a method to build thesauri on the basis of grammatical relations. The proposed method constructs thesauri by using a hierarchical clustering algorithm. An important point in this paper is the claim that thesauri, in order to be efficient, need to take (surface) case information into…
Text classification, the grouping of texts into several clusters, has been used as a means of improving both the efficiency and the effectiveness of text retrieval/categorization. In this paper we propose a hierarchical clustering algorithm that constructs a set of clusters having the maximum Bayesian posterior probability, the probability that the given…
Analyzing compound nouns is one of the crucial issues for natural language processing systems, in particular for those systems that aim at a wide coverage of domains. In this paper, we propose a method to analyze structures of Japanese compound nouns by using both word collocation statistics and a thesaurus. An experiment is conducted with 160,000 word…