Individualism and collectivism refer to cultural values that influence how people construe themselves and their relation to the world. Individualists perceive themselves as stable entities, autonomous from other people and their environment, while collectivists view themselves as dynamic entities, continually defined by their social context and …
This paper describes a novel approach to flexible control of speaker characteristics using a tensor representation of the speaker space. In voice conversion studies, converting from or to an arbitrary speaker's voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC) based on an eigenvoice Gaussian mixture model (EV-GMM) …
In this paper, we prove that the direction of cepstrum vectors strongly depends on vocal tract length and that this dependency is represented as a rotation in the n-dimensional cepstrum space. In speech recognition studies, vocal tract length normalization (VTLN) techniques are widely used to cancel age- and gender-related differences. In VTLN, a frequency warping is …
This paper proposes a new framework for speech generation that imitates "infants' vocal imitation". Most speech synthesizers take a phoneme sequence as input and generate speech by converting each phoneme into a sound sequentially. In other words, they simulate the human process of reading text aloud. However, infants usually acquire speech …
This paper describes a novel approach to voice conversion using both a joint density model and a speaker model. In voice conversion studies, approaches based on a Gaussian mixture model (GMM) of the probability density of joint vectors of source and target speakers are widely used to estimate a transformation. However, for sufficient quality, they …
This paper presents the first version of a speaker verification spoofing and anti-spoofing database, named the SAS corpus. The corpus includes nine spoofing techniques, two based on speech synthesis and seven on voice conversion. We design two protocols, one for standard speaker verification evaluation and the other for producing spoofing materials. …
Acoustic event detection systems that support heterogeneous sets of events must characterize events with different acoustic properties (transient, stationary, both, etc.), and these differences can appear even within a single acoustic event. Moreover, managing large feature vectors with features characterizing different properties of the …
This paper proposes a stochastic model of speech F0 contours, based on a stochastic formulation of the Fujisaki model. Our motivation for the stochastic formulation is twofold. Firstly, it allows us to derive a well-behaved algorithm for estimating the Fujisaki model parameters from a raw F0 contour. Secondly, it will open the door to incorporating the …