In this paper we propose a new family of kernels for handling time series, notably speech data, within the framework of kernel methods, which includes popular algorithms such as the Support Vector Machine. These kernels elaborate on the well-known Dynamic Time Warping (DTW) family of distances by considering the same set of elementary operations, namely…
This paper gives an overview of the IR for Spoken Documents Task at the NTCIR-9 Workshop. The task comprises a spoken term detection (STD) subtask and an ad-hoc spoken document retrieval (SDR) subtask. Both subtasks aim to find terms, passages, and documents in academic and simulated lectures of the Corpus of Spontaneous…
One significant problem for spoken language systems is how to cope with users' out-of-domain (OOD) utterances, which cannot be handled by the back-end application system. In this paper, we propose a novel OOD detection framework, which makes use of the classification confidence scores of multiple topics and applies a linear discriminant model to perform…
This paper reports our experiments on the concept detection task of TRECVID 2007. In these experiments, we address two approaches: feature selection and fusion, and a kernel-based learning method. For the former, we investigate the following issues: (i) which features are more appropriate for the concept detection task; (ii) whether…
This paper introduces a Japanese spontaneous speech database of 3,771 speakers with wide regional and age distributions. The database is designed to capture the characteristics of Japanese spontaneous speech and is used to develop a speaker-independent (SI) speech recognition system. This paper describes the data collection and transcription. Moreover, we show…
In this paper, we propose a novel semi-supervised speaker identification method that can alleviate the influence of non-stationarity such as session-dependent variation, changes in the recording environment, and physical conditions or emotions. We assume that the voice quality variants follow the covariate shift model, where only the voice feature distribution…