TimeML, TimeBank, and TTK (TARSQI Project) have played an important role in enhancing IE, QA, and other NLP applications. TimeML is a specification language for events and temporal expressions in text. This paper presents the problems and solutions for porting TimeML to Korean as part of the Korean TARSQI Project. We also introduce the KTTK…
In this paper we discuss research designed to investigate the ability of users to find information in texts written in languages unknown to them. One study shows how document thumbnail visualizations can be used effectively to choose potentially relevant documents. Another study shows how a user of a cross-language text retrieval system who has no foreign…
In this paper, we propose a multi-strategic matching and merging approach to find correspondences between ontologies based on the syntactic or semantic characteristics and constraints of Topic Maps. Our multi-strategic matching approach consists of a linguistic module and a Topic Map constraints-based module. The linguistic module computes similarities…
Tagging is one of the most popular services in Web 2.0, and a folksonomy is a representation of collaborative tagging. The tag cloud has been the only visualization of a folksonomy. The tag cloud, however, provides no information about the relations between tags. In this paper, targeting del.icio.us tag data, we propose a technique, FolksoViz, for…
This study aims at retrieving tweets with an implicit topic, which cannot be identified by the current query-matching system employed by Twitter. Such tweets are relevant to a given query but do not explicitly contain the term. When these tweets are combined with a relevant tweet containing the overt keyword, the “serialized” tweets can be integrated into…
Tagging is one of the most popular services in Web 2.0. As a special form of tagging, social tagging is done collaboratively by many users, which forms a so-called folksonomy. As tagging has become widespread on the Web, the tag vocabulary is now very informal, uncontrolled, and personalized. For this reason, many tags are unfamiliar and ambiguous to users…
The number of articles, blog posts, photos, and videos on the web has been increasing dramatically as internet usage grows. In this situation, web search is the most important service on the web. When we search, we can use text information from articles or blog posts. In the case of photos and videos, we can only use a title. If there…
This paper proposes a keyword extraction process, based on the PageRank algorithm, to reduce noise in the input data used for measuring semantic similarity. It introduces several implementation-related features and discusses their effects. It also discusses experimental results that showed significantly improved document retrieval performance with…
We propose a method for dealing with semantic complexities occurring in information retrieval systems on the basis of linguistic observations. Our method follows from an analysis indicating that long runs of content words appear in a stopped document cluster, and our observation that these long runs predominantly originate from the prepositional phrase and…