Daniel Aalto

The pitch contour in speech contains information about different linguistic units at several distinct temporal scales. At the finest level, the microprosodic cues are purely segmental in nature, whereas at coarser time scales, lexical tones, word accents, and phrase accents appear with both linguistic and paralinguistic functions. Consequently, the ...
Spatial data of the human vocal tract (VT), larynx, and thorax can be obtained by magnetic resonance imaging (MRI) during steady, sustained phonation. A long acquisition time increases both the resolution and the errors due to involuntary motion of the VT. We discuss two experiments with a single test subject to find a suitable 3D MRI data acquisition procedure ...
The human vocal folds are known to interact with the vocal tract acoustics during voiced speech production; namely, a nonlinear source-filter coupling has been observed both in models and in in vivo phonation. These phenomena are approached from two directions in this article. We first present a computational dynamical model of the speech apparatus that ...
This article describes an arrangement for simultaneous recording of speech and the geometry of the vocal tract. The experimental design is considered from the point of view of phonetics. The speech signal is recorded with an acoustic-electrical arrangement and the vocal tract with MRI. Finally, data from pilot measurements on vowels are presented, and their quality is ...
Many languages exploit suprasegmental devices in signaling word meaning. Tone languages exploit fundamental frequency whereas quantity languages rely on segmental durations to distinguish otherwise similar words. Traditionally, duration and tone have been taken as mutually exclusive. However, some evidence suggests that, in addition to durational cues, ...
We compare numerically computed resonances of the human vocal tract with formants that have been extracted from speech during vowel pronunciation. The geometry of the vocal tract has been obtained by MRI from a male subject, and the corresponding speech has been recorded simultaneously. The resonances are computed by solving the Helmholtz partial ...
Speech in noise, or Lombard speech, is characterized by increased intensity and higher fundamental frequency as well as lengthened segmental durations, as speakers try to maintain a beneficial signal-to-noise ratio to fulfill both communicative and self-monitoring requirements. The phenomenon has been studied with regard to different noise types and different ...
Discrete phonological phenomena form our conscious experience of language: continuous changes in pitch appear as distinct tones to the speakers of tone languages, whereas the speakers of quantity languages experience duration categorically. The categorical nature of our linguistic experience is directly reflected in the traditionally clear-cut linguistic ...
The fundamental frequency of a complex sound modulates its perceived duration. Higher-pitched sounds are perceived as longer than lower-pitched sounds, as shown by several independent studies since 1973. In this paper, the effect of language background is studied: native speakers of Finnish and German participated in a two-alternative forced ...
We present anatomic and acoustic data from a pilot study on the Finnish vowels [ɑ, e, i, o, u, y, æ, ø]. The data were acquired simultaneously with 3D magnetic resonance imaging (MRI) and a custom-built sound recording system. The data consist of a single static repetition of each vowel with constant f0. The imaging sequence was 7.6 s long and had an ...