Pine wilt disease, induced by the pinewood nematode Bursaphelenchus xylophilus, is a great threat to pine forests in Japan. The first occurrence of the disease was reported in Nagasaki, Kyushu. During the 1930s the disease spread to 12 prefectures, and in the 1940s it was found in 34 prefectures. The annual loss of pine trees(More)
Simple4All Tundra (version 1.0) is the first release of a standardised multilingual corpus designed for text-to-speech research with imperfect or found data. The corpus consists of approximately 60 hours of speech data from audiobooks in 14 languages, as well as utterance-level alignments obtained with a lightly-supervised process. Future versions of the(More)
During the past 3 yr, nematologists in the United States have found specimens of Bursaphelenchus sp. in the wood of dead and dying pine trees. This nematode-host association resembles a similar interaction reported from Japan where pine trees are being killed by the pine wood nematode. This taxonomic research was conducted to determine if the Japanese pine(More)
Audiobooks have attracted attention as promising data for training Text-to-Speech (TTS) systems. However, they usually lack a correspondence between audio and text data. Moreover, they are usually divided only into chapter units. In practice, we have to align the audio and text data before we use them for building TTS synthesisers. However(More)
This paper presents techniques for building text-to-speech front-ends in a way that avoids the need for language-specific expert knowledge, but instead relies on universal resources (such as the Unicode character database) and unsupervised learning from unannotated data to ease system development. The acquisition of expert language-specific knowledge and(More)
This paper describes the ALISA tool, which implements a lightly supervised method for sentence-level alignment of speech with imperfect transcripts. Its intended use is to enable the creation of new speech corpora from a multitude of resources in a language-independent fashion, thus avoiding the need to record or transcribe speech data. The method is(More)
We propose an incremental unsupervised adaptation method based on reinforcement learning in order to achieve robust speech recognition in various noisy environments. Reinforcement learning is a training method based on rewards that represent the correctness of outputs, rather than on supervised data. The training progresses gradually based on the rewards given. Our(More)
We describe the synthetic voices entered into the 2013 Blizzard Challenge by the SIMPLE 4 ALL consortium. The 2013 Blizzard Challenge presents an opportunity to test and benchmark some of the tools we have been developing to address two problems of interest: 1) how best to learn from plentiful 'found' data, and 2) how to produce systems in arbitrary new(More)