Data Set Used
In TREC-10, we participated in the web track (ad-hoc task only) and the QA track (main task only). In the QA track, our QA system (SiteQ) has a general architecture with three processing steps: question processing, passage selection, and answer processing. The key technique is LSPs (Lexico-Semantic Patterns), which are composed of linguistic entries and…
In this paper, we describe named entity recognition using a modified Pegasos algorithm for structural SVMs. We show that the modified Pegasos algorithm significantly outperforms CRFs and that its training time is reduced by a factor of 17 to 26 compared to CRFs.
This paper presents the automatic construction of a Korean WordNet from pre-existing lexical resources. A set of automatic WSD techniques is described for linking Korean words collected from a bilingual MRD to English WordNet synsets. We show how the individual links provided by each WSD method are then combined to produce a Korean WordNet for nouns.
In many QA systems, fine-grained named entities are extracted by a coarse-grained named entity recognizer and a fine-grained named entity dictionary. In this paper, we describe fine-grained Named Entity Recognition using Conditional Random Fields (CRFs) for question answering. We used CRFs to detect the boundaries of named entities and Maximum Entropy (ME) to…
Most previous information retrieval (IR) models assume that the terms of queries and documents are statistically independent of one another. However, this independence assumption is widely understood to be wrong, so we present a new method of incorporating term dependence into a probabilistic retrieval model by adapting a structural index system using…
This paper presents an Information Extraction (IE) approach for spoken language understanding. The goal in IE is to find proper values for the pre-defined slots of given templates. For spoken language understanding, IE suggests a concept-spotting approach, because IE is concerned only with pre-defined concept slots. In spite of this…
In this paper, we present an information extraction system that extracts template elements for a question-answering (QA) system in the encyclopedia domain. We use Conditional Random Fields to extract templates from the texts of an encyclopedia. Using the proposed approach, we achieved 74.89% precision and a 55.77% F1 score in template extraction. In…