Recurrent neural network based language model
TLDR
Results indicate that a mixture of several RNN LMs can reduce perplexity by around 50% compared to a state-of-the-art backoff language model.
The subspace Gaussian mixture model - A structured model for speech recognition
TLDR
A new approach to speech recognition, in which all Hidden Markov Model states share the same Gaussian Mixture Model (GMM) structure with the same number of Gaussians in each state, appears to give better results than a conventional model.
Probabilistic and Bottle-Neck Features for LVCSR of Meetings
TLDR
This work explores obtaining features directly from a neural net, without the need to convert output probabilities into features suitable for a subsequent GMM-HMM system.
Subspace Gaussian Mixture Models for speech recognition
TLDR
An acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure and the means and mixture weights vary in a subspace of the total parameter space; this style of acoustic model allows for a much more compact representation.
Comparison of keyword spotting approaches for informal continuous speech
TLDR
The acoustic and phoneme-lattice based KWS approaches build on a phoneme recognizer that uses temporal-pattern (TRAP) feature extraction and posterior estimation with neural nets, and show its superiority over traditional HMM/GMM systems.
Fusion of Heterogeneous Speaker Recognition Systems in the STBU Submission for the NIST Speaker Recognition Evaluation 2006
TLDR
The STBU speaker recognition system was a combination of three main kinds of subsystems, which performed well in the NIST Speaker Recognition Evaluation 2006 (SRE).
Simplification and optimization of i-vector extraction
TLDR
Under certain assumptions, the formulas for i-vector extraction, also used in i-vector extractor training, can be simplified, leading to faster and more memory-efficient code.
Transcribing Meetings With the AMIDA Systems
TLDR
An overview is given of the AMIDA systems for transcription of conference and lecture room meetings, developed for participation in the Rich Transcription evaluations conducted by the National Institute of Standards and Technology in 2007 and 2009.
The language-independent bottleneck features
TLDR
This paper presents a novel language-independent bottleneck (BN) feature extraction framework, where each language is modelled by a separate output layer, while all the hidden layers jointly model the variability of all the source languages.
Multilingual acoustic modeling for speech recognition based on subspace Gaussian Mixture Models
TLDR
This work reports experiments on a different approach to multilingual speech recognition, in which the phone sets are entirely distinct but the model has parameters not tied to specific states that are shared across languages.