Learn More
This paper describes an overview of the IR for Spoken Documents Task in NTCIR-9 Workshop. In this task, the spoken term detection (STD) subtask and ad-hoc spoken document retrieval subtask (SDR) are conducted. Both of the subtasks target to search terms, passages and documents included in academic and simulated lectures of the Corpus of Spontaneous(More)
In this paper, we investigate the answer type detection methods for realizing the Universal Question Answering (UQA), which returns an answer for any given question. For this purpose, the questions collected from a WWW question portal community site were analyzed to see how many kinds of questions were submitted in the real world. Then, we introduce the(More)
This paper presents an overview of the Spoken Query and Spoken Document retrieval (SpokenQuery&Doc-2) task at the NTCIR-12 Workshop. This task included spoken query driven spoken content retrieval (SQ-SCR) and a spoken query driven spoken term detection (SQ-STD) as the two sub-tasks. The paper describes details of each sub-task, the data used, the creation(More)
In this paper, we propose a novel indexing method for Spoken Term Detection (STD). The proposed method can be considered as using metric space indexing for the approximate stringmatching problem, where the distance between a phoneme and a position in the target spoken document is defined. The proposed method does not require the use of thresholds to limit(More)
Spoken Document Retrieval (SDR) and Spoken Term Detection (STD) have been two of the most intensively investigated topics in spoken document processing research according to the establishment of the SDR and STD test collections by the Text REtrieval Conference (TREC) and NIST. Because Japanese spoken document processing researchers also requires such test(More)
This paper describes an overview of the IR for Spoken Documents Task in NTCIR-10 Workshop. In this task, the spoken term detection (STD) subtask and ad-hoc spoken content retrieval subtask (SCR) are conducted. Both of the tasks target to search terms, passages and documents included in academic oral presentations. This paper explains the data used in the(More)
In this paper, we propose two new methods targeting NTCIR-4 QAC2. First, we use knowledge resembling " common sense " for question answering purposes. For example, the length of a runway in an airport must be a few kilometers, but a few centimeters. In practice, we use specific types of information latent in document collections to verify the correctness of(More)
One of the major challenges for spoken document retrieval is how to handle speech recognition errors within the target documents. Query expansion is promising for this challenge. In this paper, we apply relevance models, a type of query expansion method, for the spoken document passage retrieval task. We adapted the original relevance model for passage(More)