• Publications
Julius - an open source real-time large vocabulary recognition engine
EUROSPEECH2001: the 7th European Conference on Speech Communication and Technology, September 3-7, 2001, Aalborg, Denmark.
Recent Development of Open-Source Speech Recognition Engine Julius
TLDR
This paper gives an overview of Julius, describes its major features and specifications, and summarizes the developments of recent years.
Overview of the IR for Spoken Documents Task in NTCIR-9 Workshop
TLDR
This paper explains the data used in the subtasks, how transcriptions are made by speech recognition, and the details of each subtask of the IR for Spoken Documents Task in the NTCIR-9 Workshop.
Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization
TLDR
This paper presents a statistical method of single-channel speech enhancement that uses a variational autoencoder (VAE) as a prior distribution on clean speech; the method outperformed the conventional DNN-based method in unseen noisy environments.
Benchmark test for speech recognition using the Corpus of Spontaneous Japanese
We present benchmark results of automatic speech recognition using the Corpus of Spontaneous Japanese (CSJ), which has been developed in the five-year national project and will be the largest
Designing Precise and Robust Dialogue Response Evaluators
TLDR
This work proposes a reference-free evaluator that exploits semi-supervised training and pretrained (masked) language models; it achieves a strong correlation with human judgement and generalizes robustly to diverse responses and corpora.
An Unsupervised Model for Joint Phrase Alignment and Extraction
TLDR
An unsupervised model for joint phrase alignment and extraction using non-parametric Bayesian methods and inversion transduction grammars (ITGs) is presented, which matches the accuracy of the traditional two-step word alignment/phrase extraction approach while reducing the phrase table to a fraction of the original size.
Bayesian Learning of a Language Model from Continuous Speech
TLDR
Experimental results on natural, adult-directed speech demonstrate that LMs built using only continuous speech are able to significantly reduce ASR phoneme error rates, and the proposed technique of joint Bayesian learning of lexical units and an LM over lattices is shown to significantly contribute to this improvement.
ERICA: The ERATO Intelligent Conversational Android
TLDR
This paper presents an overview of the requirements and design of the platform and the development process of an interactive application, reports on ERICA's first autonomous public demonstration, and discusses the main technical challenges that remain to be addressed in order to create humanlike, autonomous androids.
Free software toolkit for Japanese large vocabulary continuous speech recognition
ICSLP2000: the 6th International Conference on Spoken Language Processing, October 16-20, 2000, Beijing, China.
...