Inferring parts of speech for lexical mappings via the Cyc KB

Tom O'Hara, Stefano Bertolo, M. Witbrock, Bjørn Aldag, Jon Curtis, Kathy Panton, David Schneider, Nancy Salay
We present an automatic approach to learning criteria for classifying the parts-of-speech used in lexical mappings. This will further automate our knowledge acquisition system for non-technical users. The criteria for the speech parts are based on the types of the denoted terms along with morphological and corpus-based clues. Associations among these and the parts-of-speech are learned using the lexical mappings contained in the Cyc knowledge base as training data. With over 30 speech parts to… 

Emotionally Driven Natural Language Generation for Personality Rich Characters in Interactive Games
A novel template-based system that offers two distinct advantages over existing systems: it enables a character's personality and emotional state to influence the feel of the utterance, and it decreases the burden on the game author.
Ontology-driven Generation of 3D Animations for Training and Maintenance
The role of the ontology is to reduce the overall complexity of the animation authoring process by assuring the necessary comprehension of the training requests as well as reusability and extensibility of the structure of both modeled objects and of animations' components in different fields of knowledge.
Using Ontology to create 3D Animations for Training Purposes
The role of the ontology is to reduce the overall complexity of the animation authoring process by assuring the necessary comprehension of customized training requests as well as reusability and extensibility of the structure of the modeled object and of animations' components in different domains.


Inducing criteria for mass noun lexical mappings using the Cyc KB, and its extension to WordNet
This paper presents an automatic approach for learning semantic criteria for the mass versus count noun distinction by induction over the lexical mappings contained in the Cyc knowledge base, preserving the general accuracy and broader applicability.
Corpus-based acquisition of head noun countability features
This thesis presents a method of automatically acquiring countability properties of head nouns from a part-of-speech tagged corpus, specifically the British National Corpus, and demonstrates that the method used is both grammatically sound and successful, showing an improvement over the baseline.
A lexicon for knowledge-based MT
In knowledge-based machine translation (KBMT), the lexicon can be specified and acquired only in close connection with the specification and acquisition of the world model (ontology) and the
Decomposable Modeling in Natural Language Processing
Describes a framework for developing probabilistic classifiers in natural language processing by formulating models that capture the most important interdependencies among features, avoiding overfitting while still characterizing the data well.
Building and Maintaining a Semantically Adequate Lexicon Using Cyc
A number of issues which have arisen in developing the Cyc lexicon are discussed, including the feasibility of taking advantage of semantic classification schemes as a shortcut to hand-entering lexical information; distinguishing between world and lexical knowledge; and providing a level of semantic detail in the lexicon which is compatible with the expressiveness of Cyc’s internal representation language.
Learning the Countability of English Nouns from Corpus Data
The method maps the corpus-attested lexico-syntactic properties of each noun onto a feature vector, and uses a suite of memory-based classifiers to predict membership in 4 countability classes.
Aggressive Morphology for Robust Lexical Coverage
A system of approximately 1200 morphological rules is used to extend a core lexicon to provide lexical coverage that exceeds that of a lexicon of 80,000 words or 150,000 word forms.
Using an Ontology to Determine English Countability
It is found that the countability of 78% of nouns could be predicted using an ontology of 2,710 nodes, which can help non-native speakers determine the countability of English nouns when building a bilingual machine translation lexicon.
Mass Terms and Model-Theoretic Semantics
An extension of classical set theory, Ensemble Theory, is defined and this provides the conceptual basis of a framework for the analysis of natural language meaning which Dr Bunt calls Two-level model-theoretic semantics.
Combining Distributional and Morphological Information for Part of Speech Induction
Unsupervised algorithms for clustering words into classes from unlabelled text, based on distributional and morphological information, are discussed, showing how the use of morphological information can improve performance on rare words.