Tanaka Speech Synthesis Using a Neural Network with a Cooperative Learning Mechanism
- M Komura
- IEICE Tech. Rep
We propose a new neural network model and its learning algorithm. The proposed neural network consists of four layers: input, hidden, output, and final output layers. There are multiple hidden layers and multiple output layers. Using the proposed SICL (Spread Pattern Information and Cooperative Learning) algorithm, it is possible to learn analog data accurately and to obtain smooth outputs. Using this neural network, we have developed a speech production system consisting of a phonemic symbol production subsystem and a speech parameter production subsystem. We have succeeded in producing natural speech waves with high accuracy.

INTRODUCTION

Our purpose is to produce natural speech waves. In general, speech synthesis by rule is used for producing speech waves. However, synthesis by rule has some difficulties. First, the rules are very complicated. Second, extracting a generalized rule is difficult. Therefore, it is hard to synthesize a natural speech wave by using rules.

We use a neural network for producing speech waves. Using a neural network, it is possible to learn speech parameters without rules. (Instead of describing rules explicitly, selecting a training data set becomes an important subject.) In this paper, we propose a new neural network model and its learning algorithm. Using the proposed neural network, it is possible to learn and produce analog data accurately. We apply the network to a speech production system and examine the system performance.

PROPOSED NEURAL NETWORK AND ITS LEARNING ALGORITHM

We use an analog neuron-like element in a neural network. The element has a logistic activation function given by equation (3). As a learning algorithm, the BP (Back Propagation) method is widely used. With this method, it is possible to learn the weighting coefficients of units whose target values are not given directly. However, there are disadvantages.
First, there are singular points where the output of the neuron-like element is 0 or 1. Second, finding the optimum values of the learning constants is not easy. We have proposed a new neural network model and its learning algorithm to solve these problems. The proposed SICL (Spread Pattern Information and Cooperative Learning) method has the following features.

(a) The singular points of the BP method are removed. (Outputs are not simply 0 or 1.) This improves the convergence rate.

(b) A spread pattern information (SI) learning algorithm is proposed. In the SI learning algorithm, the weighting coefficients from the hidden layers to the output layers are fixed to random values. Pattern information is spread over the memory space of the weighting coefficients. As a result, the network can learn analog data accurately.

(c) A cooperative learning (CL) algorithm is proposed. This algorithm makes it possible to obtain smooth and stable output. The CL system is shown in Fig. 1, where D(L) is a delay line which delays L time units.

In the following sections, we define a three-layer network, introduce the BP method, and propose the SICL method.
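
The singular points of the BP method mentioned above can be illustrated numerically. The sketch below is not taken from the paper: it assumes the standard textbook form of BP with a squared-error criterion, in which the output-layer error term carries the factor o(1 - o), the derivative of the logistic function. The names `logistic` and `bp_delta` are hypothetical and chosen for illustration only.

```python
import math


def logistic(x: float) -> float:
    """Logistic activation: maps a real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bp_delta(output: float, target: float) -> float:
    """Output-layer error term of textbook BP with squared error (an assumption
    here, not the paper's equation): (target - output) * output * (1 - output)."""
    return (target - output) * output * (1.0 - output)


# The target is 1, so the error grows as the output moves toward 0,
# yet the correction term shrinks because output * (1 - output) goes to zero.
for o in (0.5, 0.1, 0.01, 0.0):
    print(f"output={o:<5} error={1.0 - o:.2f} delta={bp_delta(o, 1.0):.6f}")
```

Under this assumption, a unit whose output has saturated at 0 or 1 receives no weight correction however wrong it is, which is presumably one reason the SICL method keeps outputs away from exactly 0 or 1.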