This paper describes a method of extracting katakana words and phrases, along with their English counterparts, from non-aligned monolingual web search engine query logs. The method employs a trainable edit distance function to find <katakana, English> pairs that have a high probability of being equivalent. These pairs can then be used to further bootstrap…
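As a rough illustration of the edit distance idea in the abstract above, the sketch below computes a weighted edit distance in which substitution costs come from a lookup table. Everything here is an assumption made for illustration: the function name `weighted_edit_distance`, the `sub_cost` table, and the idea of comparing a romanized katakana string with an English string are hypothetical, and the paper's actual cost model and training procedure are not shown in this excerpt.

```python
from typing import Dict, Tuple


def weighted_edit_distance(
    source: str,
    target: str,
    sub_cost: Dict[Tuple[str, str], float],
    indel_cost: float = 1.0,
) -> float:
    """Edit distance with per-pair substitution costs.

    sub_cost is a hypothetical table of learned costs; pairs that are
    missing from it fall back to a cost of 1.0.
    """
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0.0] * cols for _ in range(rows)]

    # Turning a prefix into the empty string (or back) costs only indels.
    for i in range(1, rows):
        dist[i][0] = i * indel_cost
    for j in range(1, cols):
        dist[0][j] = j * indel_cost

    for i in range(1, rows):
        for j in range(1, cols):
            a, b = source[i - 1], target[j - 1]
            replace = 0.0 if a == b else sub_cost.get((a, b), 1.0)
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost,    # delete a
                dist[i][j - 1] + indel_cost,    # insert b
                dist[i - 1][j - 1] + replace,   # match or substitute
            )
    return dist[rows - 1][cols - 1]


# Hypothetical cost table: substituting "r" with "l" is made cheap.
example_costs = {("r", "l"): 0.2}
score = weighted_edit_distance("raito", "light", example_costs)
```

In a trainable setting the entries of the cost table would be estimated from known pairs rather than written by hand, and a lower score would be read as a higher likelihood that the two strings are equivalent.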
This paper explores techniques for reducing the effectiveness of standard authorship attribution techniques so that an author A can preserve anonymity for a particular document D. We discuss feature selection and adjustment and show how this information can be fed back to the author to create a new document D' for which the calculated attribution moves away…
We will demonstrate MindNet, a lexical resource built automatically by processing text. We will present two forms of MindNet: as a static lexical resource, and as a toolkit that allows MindNets to be built from arbitrary text. We will also introduce a web-based interface to MindNet lexicons (MNEX) that is intended to make the data contained within…
In this paper, we give a general description of the issues associated with performing basic Question-Answering (QA) tasks against non-player characters (NPCs) within a simple role-playing game (RPG) or virtual world environment. We describe the aspects of this kind of QA system and provide an overview of our initial explorations into the implementation and…
This paper describes our work motivating a group of students (grades 5-8) to learn real-world computer programming by introducing them to homebrew development for the Nintendo Gameboy Advance (GBA) and DS (NDS) systems using C. Students use a freely available professional toolchain (devkitPro) for development. A custom application was written that allowed…
In this document, we describe our work applying natural language (NL) technologies to improve non-player character (NPC) dialog interactions in games, specifically role-playing games (RPGs). Our approach is to adapt the standard dialog menu interaction so that the menu items are dynamically generated during game runtime rather than scripted during…
We describe a method of word segmentation in Japanese in which a broad-coverage parser selects the best word sequence while producing a syntactic analysis. This technique is substantially different from traditional statistics- or heuristics-based models, which attempt to select the best word sequence before handing it to the syntactic component. By breaking…
We describe a segmentation component that utilizes minimal syntactic knowledge to produce a lattice of word candidates for a broad-coverage Japanese NL parser. The segmenter is a finite-state morphological analyzer and text normalizer designed to handle the orthographic variations characteristic of written Japanese, including alternate spellings, script…
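To make the notion of a word-candidate lattice in the abstract above concrete, here is a deliberately simplified sketch that enumerates every substring found in a small lexicon and records it as an edge with start and end offsets. This is plain dictionary lookup, not the finite-state morphological analysis or text normalization the abstract describes; the function name `build_lattice`, the `max_len` limit, and the toy lexicon are all hypothetical.

```python
from typing import List, Set, Tuple

# An edge is (start offset, end offset, surface form).
Edge = Tuple[int, int, str]


def build_lattice(text: str, lexicon: Set[str], max_len: int = 8) -> List[Edge]:
    """Return every lexicon word found in text as a lattice edge.

    Overlapping candidates are all kept, so choosing among them is left
    to a later component such as a parser.
    """
    edges: List[Edge] = []
    for start in range(len(text)):
        longest_end = min(len(text), start + max_len)
        for end in range(start + 1, longest_end + 1):
            candidate = text[start:end]
            if candidate in lexicon:
                edges.append((start, end, candidate))
    return edges


# Toy lexicon, assumed for illustration only.
toy_lexicon = {"東", "東京", "京都", "都"}
lattice = build_lattice("東京都", toy_lexicon)
```

For the toy input, the lattice holds overlapping candidates such as 東京 (offsets 0 to 2) and 京都 (offsets 1 to 3). Keeping both is the point of a lattice: the segmenter does not commit to one word sequence on its own.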
This paper describes a comprehensive system for retrieving essential insights from web texts, using style factors for profile and tone classification. Our goal in this paper, beyond demonstrating the actual system, is to bring up the notion that having the ability to determine a document's writing style and extract its stylistic factors is insufficient…