Seiichi Yamamoto

At ATR Spoken Language Translation Research Laboratories, we are building a broad-coverage bilingual corpus to study corpus-based speech translation technologies for the real world. There are three important points to consider in designing and constructing a corpus for future speech translation research. The first is to have a variety of speech …
This paper presents three approaches to creating corpora that we are working on for speech-to-speech translation in the travel conversation task. The first approach is to collect sentences that bilingual travel experts consider useful for people going to or coming from another country. The resulting English-Japanese aligned corpora are collectively called the …
This paper proposes the automatic generation of Fill-in-the-Blank Questions (FBQs) together with testing based on Item Response Theory (IRT) to measure English proficiency. First, the proposal generates an FBQ from a given sentence in English. The position of a blank in the sentence is determined, and the word at that position is considered as the correct …
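As a rough illustration of the fill-in-the-blank idea (not the paper's actual algorithm), the sketch below blanks out one content word of a sentence and pairs it with distractors, and includes the standard two-parameter logistic IRT response model; the stopword list, the content-word heuristic, and the distractor pool are hypothetical placeholders.

```python
# Minimal sketch of fill-in-the-blank question (FBQ) generation plus the
# standard 2PL IRT response curve. Illustrative only; the heuristics and
# distractor pool below are invented for the example.
import math
import random

STOPWORDS = {"the", "a", "an", "to", "of", "in", "on", "and", "is", "are"}

def make_fbq(sentence, distractor_pool, n_distractors=3, seed=0):
    """Blank out one content word and return (stem, options, answer)."""
    rng = random.Random(seed)
    words = sentence.split()
    # Candidate blank positions: skip stopwords and very short tokens.
    candidates = [i for i, w in enumerate(words)
                  if w.lower().strip(".,") not in STOPWORDS and len(w) > 3]
    if not candidates:
        return None
    pos = rng.choice(candidates)
    answer = words[pos].strip(".,")
    stem = " ".join(w if i != pos else "_____" for i, w in enumerate(words))
    distractors = rng.sample([d for d in distractor_pool if d != answer],
                             n_distractors)
    options = distractors + [answer]
    rng.shuffle(options)
    return stem, options, answer

def irt_2pl_prob(theta, a, b):
    """Two-parameter logistic IRT model: P(correct | ability theta),
    with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

if __name__ == "__main__":
    print(make_fbq("The hotel arranged a shuttle to the airport for us",
                   distractor_pool=["station", "museum", "harbor", "terminal"]))
    print(irt_2pl_prob(theta=0.5, a=1.2, b=0.0))
```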
We have built a new speech translation system called ATR-MATRIX (ATR's Multilingual Automatic Translation System for Information Exchange). This system can recognize natural Japanese utterances such as those used in daily life, translate them into English, and output synthesized speech. The system runs on a workstation or a high-end PC and achieves …
This paper investigates issues in preparing corpora for developing speech-to-speech translation (S2ST). It is impractical to create a broad-coverage parallel corpus only from dialog speech. An alternative approach is to have bilingual experts write conversational-style texts in the target domain, with translations. There is, however, a risk of losing …
Eye gaze is an important means for controlling interaction and coordinating the participants' turns smoothly. We have studied how eye gaze correlates with spoken interaction, focusing especially on the combined effect of the speech signal and gaze in predicting turn-taking possibilities. It is well known that mutual gaze is important in the coordination of …
Hierarchical phrase-based machine translation can capture global reordering with a synchronous context-free grammar, but has little ability to evaluate the correctness of word order during decoding. We propose a method that integrates a word-based reordering model into hierarchical phrase-based machine translation to overcome this weakness. Our approach extends …
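The toy scorer below illustrates the general shape of such an integration, assuming a standard log-linear hypothesis score to which a word-based reordering feature is added; the feature names, weights, and distance-based penalty are hypothetical and are not the decoder described in the paper.

```python
# Toy illustration of adding a word-based reordering score to a
# log-linear translation hypothesis score. Feature names and weights
# are made up for the example.

def reordering_penalty(alignment):
    """Sum of jump distances between consecutively translated source words.

    `alignment` lists the source position covered by each target word,
    in target order; larger jumps mean more reordering."""
    return sum(abs(alignment[i] - alignment[i - 1] - 1)
               for i in range(1, len(alignment)))

def hypothesis_score(features, weights):
    """Standard log-linear combination: score = sum_k w_k * h_k."""
    return sum(weights[k] * v for k, v in features.items())

if __name__ == "__main__":
    features = {
        "rule_logprob": -4.2,   # synchronous CFG rule scores
        "lm_logprob": -7.8,     # target language model
        "reordering": -reordering_penalty([0, 2, 1, 3]),  # word-based reordering
    }
    weights = {"rule_logprob": 1.0, "lm_logprob": 0.6, "reordering": 0.3}
    print(hypothesis_score(features, weights))
```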
In this paper, we describe the ATR multilingual speech-to-speech translation (S2ST) system, which is mainly focused on translation between English and Asian languages (Japanese and Chinese). There are three main modules in our S2ST system: large-vocabulary continuous speech recognition, machine text-to-text (T2T) translation, and text-to-speech synthesis. …
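The sketch below shows the shape of such a three-module cascade (ASR, then T2T translation, then TTS); the classes are placeholders standing in for real engines, not the ATR components themselves.

```python
# Schematic of a speech-to-speech cascade: source speech -> source text
# -> target text -> target speech. All module classes are placeholders.
from dataclasses import dataclass

@dataclass
class Recognizer:
    language: str
    def transcribe(self, audio: bytes) -> str:
        raise NotImplementedError  # wrap a real LVCSR engine here

@dataclass
class Translator:
    source: str
    target: str
    def translate(self, text: str) -> str:
        raise NotImplementedError  # wrap a real T2T MT engine here

@dataclass
class Synthesizer:
    language: str
    def synthesize(self, text: str) -> bytes:
        raise NotImplementedError  # wrap a real TTS engine here

def speech_to_speech(audio: bytes, asr: Recognizer,
                     mt: Translator, tts: Synthesizer) -> bytes:
    """Run the cascade and return synthesized target-language audio."""
    source_text = asr.transcribe(audio)
    target_text = mt.translate(source_text)
    return tts.synthesize(target_text)
```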
This paper proposes a method for designing a sentence set for utterances that takes account of prosody. This method is based on a measure of coverage which incorporates two factors: (1) the distribution of voice fundamental frequency and phoneme duration predicted by the prosody generation module of a TTS; (2) perceptual damage to naturalness due to prosody …
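A greedy coverage-driven selection, shown below, illustrates the flavor of such sentence-set design; here the "prosodic units" are faked with a placeholder function and the perceptual-damage weighting from the paper is omitted, so this is only a simplified sketch.

```python
# Greedy sketch of selecting a sentence set that maximizes coverage of
# prosodic units (e.g. F0 / phoneme-duration classes). The unit
# extraction below is a stand-in, not the paper's coverage measure.

def greedy_select(sentences, unit_fn, budget):
    """Pick up to `budget` sentences, each time adding the one that
    covers the most not-yet-covered units."""
    covered, selected = set(), []
    remaining = list(sentences)
    for _ in range(budget):
        best, best_gain = None, 0
        for s in remaining:
            gain = len(unit_fn(s) - covered)
            if gain > best_gain:
                best, best_gain = s, gain
        if best is None:
            break
        selected.append(best)
        covered |= unit_fn(best)
        remaining.remove(best)
    return selected, covered

if __name__ == "__main__":
    # Placeholder "prosodic units": character bigrams of the sentence.
    unit_fn = lambda s: {s[i:i + 2] for i in range(len(s) - 1)}
    corpus = ["could you call a taxi", "what time does it open",
              "could you open the window"]
    selected, covered = greedy_select(corpus, unit_fn, budget=2)
    print(selected)
```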
Spoken interactions are known for accurate timing and alignment between interlocutors: turn-taking and topic flow are managed in a manner that provides conversational fluency and smooth progress of the task. This paper studies the relation between the interlocutors' eye-gaze and spoken utterances, and describes our experiments on turn alignment. We …
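To make the gaze-plus-speech idea concrete, the snippet below combines three hand-picked cues into a probability that the current speaker is about to yield the turn; the cues, weights, and thresholds are invented for illustration, whereas the models in these papers are trained statistically on annotated dialogue data.

```python
# Invented illustration of combining gaze and speech cues to estimate the
# probability that the speaker will yield the turn. Weights are arbitrary.
import math

def turn_yield_probability(pause_ms: float, gazing_at_listener: bool,
                           pitch_falling: bool) -> float:
    """Combine three cues with a logistic squashing function."""
    score = (
        0.004 * pause_ms                      # longer pauses suggest yielding
        + (1.5 if gazing_at_listener else 0)  # gaze at the listener often precedes hand-over
        + (1.0 if pitch_falling else 0)       # falling pitch marks utterance completion
        - 3.0                                 # bias term
    )
    return 1.0 / (1.0 + math.exp(-score))

if __name__ == "__main__":
    print(turn_yield_probability(pause_ms=600, gazing_at_listener=True,
                                 pitch_falling=True))
```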