Mark Huckvale

This paper describes the results of a study of the phonetic and phonological factors affecting the rhythm and timing of spoken Korean. Stepwise construction of a CART model was used to uncover the contribution and relative importance of phrasal, syllabic, and segmental contexts. The model was trained on a corpus of 671 read sentences, yielding 42,000 …
This paper introduces a new metric for the quantitative assessment of the similarity of speakers' accents. The ACCDIST metric is based on the correlation of inter-segment distance tables across speakers or groups. Basing the metric on segment similarity within a speaker ensures that it is sensitive to the speaker’s pronunciation system rather than to his or …
As users are only too aware, contemporary large vocabulary speech recognition systems do not respond to speech in the same way as humans. The dictation systems in use today are very sensitive to disfluencies, restarts, background noise and changes of speaker or voice quality. Furthermore, the recognition mistakes they make seem to be very different …
As well as conveying a message in words and sounds, the speech signal carries information about the speaker's own anatomy, physiology, linguistic experience and mental state. These speaker characteristics are found in speech at all levels of description: from the spectral information in the sounds to the choice of words and utterances themselves. This …
Speech synthesis research has been transformed in recent years through the exploitation of speech corpora – both for statistical modelling and as a source of signals for concatenative synthesis. This revolution in methodology and the new techniques it brings call into question the received wisdom that better computer voice output will come from a better …
Intonation modelling in ProSynth involves mapping the defining characteristics of an F0 contour onto the constituents of a hierarchical prosodic structure, which constitutes our core linguistic representation. The paper describes the use of a labelled speech database exemplifying selected structures to create a template for a particular pitch pattern in a …
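
The ACCDIST abstract above describes comparing accents by correlating inter-segment distance tables across speakers. The Python sketch below illustrates that general idea only and is not the paper's implementation. The two-dimensional formant-like features, the Euclidean distance, the Pearson correlation, and the names `distance_table`, `pearson`, and `accent_similarity` are all assumptions made for illustration, since the truncated abstract does not specify them.

```python
import math
from itertools import combinations


def distance_table(segments):
    """Return one speaker's pairwise segment distances as a flat list.

    `segments` maps a segment label to a feature vector (assumed here to be
    a tuple of formant-like values). Labels are sorted so that two speakers
    with the same labels produce tables in the same order.
    """
    labels = sorted(segments)
    return [math.dist(segments[a], segments[b]) for a, b in combinations(labels, 2)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def accent_similarity(speaker_a, speaker_b):
    """Correlate the two speakers' distance tables over their shared segments."""
    shared = set(speaker_a) & set(speaker_b)
    table_a = distance_table({k: speaker_a[k] for k in shared})
    table_b = distance_table({k: speaker_b[k] for k in shared})
    return pearson(table_a, table_b)


# Hypothetical data: speaker_b has the same vowel system as speaker_a but every
# value is shifted by a constant offset (standing in for a different voice);
# speaker_c has a different system in which "a" sits close to "i".
speaker_a = {"i": (300.0, 2300.0), "a": (750.0, 1300.0), "u": (350.0, 800.0)}
speaker_b = {k: (f1 + 50.0, f2 + 200.0) for k, (f1, f2) in speaker_a.items()}
speaker_c = {"i": (300.0, 2300.0), "a": (320.0, 2250.0), "u": (350.0, 800.0)}

same_system = accent_similarity(speaker_a, speaker_b)
different_system = accent_similarity(speaker_a, speaker_c)
```

Because each table holds only distances measured within one speaker, the constant offset applied to `speaker_b` leaves its table unchanged, so `same_system` comes out higher than `different_system`. This mirrors the abstract's point that the metric responds to the pronunciation system rather than to the individual voice.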