Shinichi Homma

There is a great need for more TV programs to be closed-captioned to help hearing-impaired and elderly people watch TV. For that purpose, automatic speech recognition is expected to contribute to producing text from speech in real time. NHK has been using speech recognition for closed-captioning of some of its news, sports and other live TV programs. In(More)
There is a great need for more TV programs to be subtitled to help hearing-impaired and elderly people watch TV. NHK has researched automatic speech recognition for efficiently subtitling live TV programs in real time. Our speech recognition system learns frequent words and expressions expected in the program beforehand and also learns characteristics of(More)
A new real-time closed-captioning system for Japanese broadcast news programs is described. The system is based on a hybrid automatic speech recognition system that switches input speech between the original program sound and rephrased speech from a "re-speaker". It minimises the number of correction operators, generally to one or two, depending on the(More)
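The abstract stops short of the switching logic, but the basic idea of a hybrid recogniser that routes either the original programme sound or the re-speaker's rephrased speech to the decoder can be sketched as follows; the segment tags and data layout are assumptions for illustration, not the paper's actual interface:

```python
from typing import Iterable, Tuple

def route_speech(segments: Iterable[Tuple[str, bytes, bytes]]) -> Iterable[bytes]:
    """For each programme segment, yield the audio the recogniser should
    decode: the original sound when the segment is tagged as suitable for
    direct recognition, the re-speaker's rephrased speech otherwise.
    The tag names are purely illustrative."""
    for tag, program_audio, respeaker_audio in segments:
        if tag == "direct":      # e.g. clean read speech by a studio anchor
            yield program_audio
        else:                    # e.g. field reports, interviews, noisy audio
            yield respeaker_audio

# Example: two segments, one decoded directly, one via the re-speaker.
segments = [("direct", b"anchor-audio", b"respeak-1"),
            ("respeak", b"field-audio", b"respeak-2")]
print(list(route_speech(segments)))
```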
It is desirable to update the language model of a speech recognition system consistently and seamlessly, without stopping it, for online applications such as real-time closed-captioning. This paper proposes a novel speech recognition system that allows the model to be updated at any time, even while it is running. It can run a second decoder with the latest model in(More)
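The snippet mentions a second decoder running with the latest model; one way to picture that idea, using hypothetical class names rather than the system's actual API, is a double-buffered recogniser that builds the standby decoder in the background and switches over atomically:

```python
import threading

class Decoder:
    """Stand-in for a speech decoder bound to one language model."""
    def __init__(self, lm_version: str):
        self.lm_version = lm_version
    def decode(self, audio: bytes) -> str:
        return f"hypothesis (LM {self.lm_version})"

class HotSwapRecognizer:
    """Keeps one active decoder; an updated language model is loaded
    into a second decoder and swapped in without stopping recognition."""
    def __init__(self, lm_version: str):
        self._lock = threading.Lock()
        self._active = Decoder(lm_version)

    def update_model(self, new_lm_version: str) -> None:
        standby = Decoder(new_lm_version)   # built while decoding continues
        with self._lock:
            self._active = standby          # atomic switch-over

    def decode(self, audio: bytes) -> str:
        with self._lock:
            decoder = self._active
        return decoder.decode(audio)

recognizer = HotSwapRecognizer("2024-01")
print(recognizer.decode(b"frame"))
recognizer.update_model("2024-02")          # no interruption for callers
print(recognizer.decode(b"frame"))
```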
This paper describes a lattice-based risk minimization training method for unsupervised language model (LM) adaptation. In a broadcast archiving system, unsupervised LM adaptation using transcriptions generated by speech recognition is considered useful for improving performance. However, conventional linear interpolation methods occasionally(More)
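The lattice-based risk-minimisation training itself is not spelled out in this snippet, but the conventional baseline it contrasts with, linear interpolation of a background LM with an LM estimated from automatic transcriptions, is standard and easy to illustrate; the probabilities below are toy values, not figures from the paper:

```python
def interpolate_lm(p_background, p_adapted, lam=0.8):
    """Linear interpolation of two language models for a fixed history h:
    P(w | h) = lam * P_bg(w | h) + (1 - lam) * P_adapt(w | h).
    Each argument maps words to conditional probabilities; words missing
    from one model are treated as having probability zero."""
    vocab = set(p_background) | set(p_adapted)
    return {w: lam * p_background.get(w, 0.0)
               + (1.0 - lam) * p_adapted.get(w, 0.0)
            for w in vocab}

# Toy example: mix a general broadcast-news LM with one estimated from
# recognised transcriptions of the target archive.
p_bg = {"news": 0.4, "weather": 0.3, "sports": 0.3}
p_ad = {"weather": 0.6, "typhoon": 0.4}
print(interpolate_lm(p_bg, p_ad))
```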
This paper describes a new criterion for speech recognition that uses an integrated confidence measure to minimize the word error rate (WER). Conventional criteria for WER minimization obtain the expected WER of a sentence hypothesis merely by comparing it with the other hypotheses in an n-best list. The proposed criterion estimates the expected WER by using(More)
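The conventional computation the snippet refers to, estimating a hypothesis's expected WER from posterior-weighted comparisons against the other hypotheses in an n-best list, can be written down directly; the proposed integrated confidence measure is only named in the snippet, so this sketch covers just the n-best baseline with made-up posteriors:

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[m][n]

def expected_wer(hypothesis, nbest):
    """Posterior-weighted word error of `hypothesis` against every
    hypothesis in the n-best list, each treated in turn as the reference."""
    return sum(p * edit_distance(hypothesis, ref) / max(len(ref), 1)
               for ref, p in nbest)

# Picking the hypothesis with minimum expected WER is the usual
# minimum-Bayes-risk decision over the n-best list.
nbest = [("tokyo news tonight".split(), 0.5),
         ("tokyo new tonight".split(), 0.3),
         ("kyoto news tonight".split(), 0.2)]
best = min((h for h, _ in nbest), key=lambda h: expected_wer(h, nbest))
print(" ".join(best))
```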
We present a new discriminative method of acoustic model adaptation that deals with task-dependent speaking styles. We have focused on differences in expressions and speaking styles between tasks, and the objective of this method is to improve the recognition accuracy of indistinctly pronounced phrases that depend on the speaking style. The adaptation appends(More)