In TREC-10, we participated in the web track (ad-hoc task only) and the QA track (main task only). In the QA track, our QA system (SiteQ) has a general architecture with three processing steps: question processing, passage selection, and answer processing. The key technique is LSPs (Lexico-Semantic Patterns) that are composed of linguistic entries and...
Schema and data conflicts between component databases are a crucial problem in building multidatabase systems. This article presents a comprehensive framework for classifying these conflicts. The proliferation of file systems, navigational database systems (hierarchical and network), and relational database systems during the past three decades has created...
In this paper, we present a new method of representing the surface syntactic structure of a sentence. Trees have usually been used in linguistics and natural language processing to represent the syntactic structure of a sentence. A tree structure shows only one possible syntactic parse of a sentence, but in order to choose a correct parse, we need to examine...
A wide range of supervised learning algorithms has been applied to Text Categorization. However, the supervised learning approaches have some problems. One of them is that they require a large, often prohibitive, number of labeled training documents for accurate learning. Generally, acquiring class labels for training data is costly, while gathering a large...
Named entity recognition, as a task of providing important semantic information, is a critical first step in Information Extraction and Question-Answering systems. This paper proposes a hybrid method for named entity recognition that combines a maximum entropy model, a neural network, and pattern-selection rules. The maximum entropy model is used for the...
The analysis of a speech act is important for dialogue understanding systems because the speech act of an utterance is closely associated with the user's intention in the utterance. This paper proposes a speech act classification model that effectively uses a two-layer hierarchical structure generated from the adjacency pair information of speech acts. The...
Automatic text categorization is the problem of assigning text documents to pre-defined categories. In order to classify text documents, we must extract useful features. In previous research, a text document is commonly represented by the term frequency and the inverse document frequency of each feature. Since there is a difference between important...
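The term frequency and inverse document frequency representation mentioned in the last abstract can be sketched as follows. This is a minimal illustration of the conventional weighting (raw count times the logarithm of N over document frequency), not the method the paper proposes; the function name `tf_idf` and the toy documents are assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Hypothetical helper for illustration: weight = tf * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts within this document
        vectors.append(
            {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        )
    return vectors


# Toy corpus (made up for the example).
docs = [
    "stock market falls".split(),
    "stock prices rise".split(),
    "team wins match".split(),
]
vectors = tf_idf(docs)
```

Under this weighting, a term that appears in every document receives a weight of zero, while a term confined to a single document receives the largest weight per occurrence.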