Doo Soon Kim

KLEO is a bootstrapping learning-by-reading system that builds a knowledge base fully automatically by reading texts for a domain. KLEO starts from a small knowledge base of domain-independent knowledge and expands it with information extracted from the texts. A key facility in KLEO is knowledge...
In this paper, we present an approach that jointly infers the boundaries of tokens and their labels to construct dictionaries for Information Extraction. Our approach to joint inference is based on graph propagation and extends it in two novel ways. First, we extend the graph representation to capture ambiguities that occur during the token extraction...
Short listings such as classified ads or product listings abound on the web. If a computer can reliably extract information from them, it will greatly benefit a variety of applications. Short listings are, however, challenging to process due to their informal styles. In this paper, we present an unsupervised information extraction system for short listings.
A traditional goal of Artificial Intelligence research has been a system that can read unrestricted natural language texts on a given topic, build a model of that topic and reason over the model. Natural Language Processing advances in syntax and semantics have made it possible to extract a limited form of meaning from sentences. Knowledge Representation...
We previously proposed a packed graphical representation to succinctly represent a huge number of alternative semantic representations of a given sentence. We also showed that this representation could improve text interpretation accuracy considerably because the system could postpone resolving ambiguity until more evidence accumulates. This paper discusses...
Text Understanding systems often commit to a single best interpretation of a sentence before analyzing subsequent text. This interpretation is chosen by resolving ambiguous alternatives to the one with the highest confidence, given the context available at the time of commitment. Subsequent text, however, may contain information that changes the confidence...