Shinichi Homma

There is a great need for more TV programs to be closed-captioned to help hearing-impaired and elderly people watch TV. For that purpose, automatic speech recognition is expected to contribute by providing text from speech in real time. NHK has been using speech recognition for closed-captioning of some of its news, sports and other live TV programs. In …
Some working groups of the IETF and other Standards Developing Organizations are now discussing use cases of a technology that enables data packets to traverse appropriate service functions located remotely through networks. This is called Service Chaining in this document. (Also, in Network Functions Virtualisation (NFV), a subject that forwarding packets …
There is a great need for more TV programs to be subtitled to help hearing-impaired and elderly people watch TV. NHK has researched automatic speech recognition for efficient real-time subtitling of live TV programs. Our speech recognition system learns in advance the frequent words and expressions expected in the program and also learns characteristics of …
A new real-time closed-captioning system for Japanese broadcast news programs is described. The system is based on a hybrid automatic speech recognition system that switches input speech between the original program sound and the rephrased speech by a "re-speaker". It minimises the number of correction operators, generally to one or two, depending on the …
This paper describes a lattice-based risk minimization training method for unsupervised language model (LM) adaptation. In a broadcast archiving system, unsupervised LM adaptation using transcriptions generated by speech recognition is considered useful for improving performance. However, conventional linear interpolation methods occasionally …
Low-latency speaker diarization is desirable for online-oriented speaker adaptation in real-time speech recognition. Especially in spontaneous conversations, several speakers tend to speak alternately and continuously, with no silence between utterances. We therefore propose a speaker diarization method that detects speaker-change points and …