This paper presents the recent development of the HTK broadcast news transcription system. Previously we have used data-type-specific modelling based on adapted Wall Street Journal trained HMMs. However, we are now experimenting with data for which no manual pre-classification or segmentation is available and therefore automatic techniques are required and …
This paper reports on the correlation between word confusion matrices from word error rate (WER) experiments and different phonetic distance measures. The investigated phonetic distance measures are based on the minimum edit distances between phonetic transcriptions and the distances between hidden Markov models (HMMs). We show that phonetic distance …
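As an illustration of the first kind of measure named in this abstract, the following is a minimal sketch of a minimum edit distance between two phonetic transcriptions. It assumes unit costs for insertion, deletion and substitution, and the function name `phone_edit_distance` and the phone symbols are hypothetical; the paper's own cost scheme is not given in the excerpt.

```python
def phone_edit_distance(a, b):
    """Levenshtein distance between two phone sequences.

    Assumption: every insertion, deletion and substitution costs 1.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty sequence
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # delete pa
                            curr[j - 1] + 1,      # insert pb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# Hypothetical transcriptions of "cat" and "cap".
cat = ["k", "ae", "t"]
cap = ["k", "ae", "p"]
print(phone_edit_distance(cat, cap))
```

A phonetically informed variant would replace the unit substitution cost with a per-phone-pair cost table.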
This thesis investigates the iterative application of Monte Carlo methods to the problem of parameter estimation for models of maximum entropy, minimum divergence, and maximum likelihood among the class of exponential-family densities. It describes a suite of tools for applying such models to large domains in which exact computation is not practically …
It is well known [5] that variations in speaking rate can account for a significant percentage of errors in practical speech recognition tasks. This is the result of the dynamic nature of speech, which is not modelled properly by the standard HMM structure. This paper proposes an extension to the standard HMM that takes advantage of the information about the …
This paper describes a number of recent improvements to the HTK Broadcast News Transcription System. Changes to the system include the use of more acoustic training data; use of cluster-based variance normalisation and vocal tract length normalisation; the use of interpolated language models and enhanced adaptation using a full variance transform. These …
The CU-MDR Demo [3] is a web-based application that allows the user to query a database of automatically generated transcripts of radio broadcasts that are available on-line. The system downloads the audio track of British and American news broadcasts from the Internet once a day and adds them to its archive. The audio, which comes in RealAudio format, is …
  • Andreas Tuerk
  • 2008
This paper develops implicit softmax transforms (IST), which are dimensionality-reducing transforms that are defined implicitly by minimisation of a weighted sum of Kullback-Leibler distances (WKL). The parameters of an IST are trained in combination with the parameters of the polynomial exponents of softmax functions. The resulting gradient of the WKL can …
This paper investigates the problem of inserting an additional hidden variable into a standard HMM. It is shown that this can be done by introducing a continuous feature which is used to calculate the probability of observing the different states of the hidden variable. The posteriors are modelled by softmax functions with polynomial exponents and an …
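To make the phrase "softmax functions with polynomial exponents" concrete, here is a minimal sketch of how posteriors over the states of a hidden variable could be computed from a single continuous feature. The function name `softmax_posteriors`, the scalar feature and the coefficient layout are assumptions made for illustration; the excerpt does not specify the paper's parameterisation or training procedure.

```python
import math


def softmax_posteriors(x, coeffs):
    """Posterior over hidden-variable states given a scalar feature x.

    Assumption: coeffs[k] lists the polynomial coefficients [c0, c1, ...]
    for state k, so the exponent for state k is c0 + c1*x + c2*x**2 + ...
    """
    exponents = [sum(c * x ** p for p, c in enumerate(cs)) for cs in coeffs]
    # Subtract the maximum exponent for numerical stability.
    m = max(exponents)
    weights = [math.exp(e - m) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# Two hypothetical states with linear exponents +x and -x.
print(softmax_posteriors(0.5, [[0.0, 1.0], [0.0, -1.0]]))
```

With linear exponents and two states this reduces to a logistic function of the feature; higher-order coefficients allow non-monotonic posteriors.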