Luca Rigazio

Recent work has shown deep neural networks (DNNs) to be highly susceptible to well-designed, small perturbations at the input layer, so-called adversarial examples. Taking images as an example, such distortions are often imperceptible, but can result in 100% misclassification for a state-of-the-art DNN. We study the structure of adversarial examples and...
Our previous study on maximum relative margin estimation (MRME) of HMM (C. Liu et al., 2005) demonstrated its advantage over the standard minimum classification error (MCE) training. In this paper, we report our recent improvement on MRME. Specifically, two novel approaches are proposed to handle recognition errors in training sets for the MRME. One is a...
The “eigenvoice” technique achieves rapid speaker adaptation by employing prior knowledge of speaker space obtained from reference speakers to place strong constraints on the initial model for each new speaker [9,10]. It has recently been shown to yield very fast adaptation for a large-vocabulary system [3] ([5] modifies the technique in an interesting...
In this paper, we summarize systems submitted by PSTL to the evaluation. We ran Meta-Data (MD) on Switchboard (SWB) and Broadcast News (BN) data. Speech-to-text systems were built and tested on both SWB and BN systems with limited real-time constraints. For our first participation, our systems were characterized by low complexity, exploratory operating...
This paper presents a new speech feature representation, called subband analysis, that uses a wavelet decomposition of the speech signal. This parameterization derives cepstral coefficients from the output of an unbalanced tree-structured filter-bank combining high-pass and low-pass filters with downsampling units. Inspired by the SUBCEP analysis of [1] and [2],...
We present an optimized implementation of the Viterbi algorithm suitable for small to large vocabulary, and isolated or continuous speech recognition. The Viterbi algorithm is certainly the most popular dynamic programming algorithm used in speech recognition. In this paper we propose a new algorithm that outperforms the Viterbi algorithm in terms of...
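For readers unfamiliar with the baseline mentioned in this abstract, the following is a minimal sketch of the standard textbook Viterbi recursion in the log domain. It is not the optimized algorithm the paper proposes, whose details are not given here. The function name `viterbi` and the dictionary-based model layout (`log_start`, `log_trans`, `log_emit`) are assumptions made for illustration only.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_path, best_log_prob) for a discrete HMM.

    Hypothetical model layout, chosen for this sketch:
      log_start[s]     log P(first state = s)
      log_trans[p][s]  log P(next state = s | current state = p)
      log_emit[s][o]   log P(observation = o | state = s)
    """
    # best[t][s]: log-probability of the best path ending in state s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]

    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor that maximizes the path score into s.
            prev = max(states, key=lambda p: best[t - 1][p] + log_trans[p][s])
            best[t][s] = best[t - 1][prev] + log_trans[prev][s] + log_emit[s][obs[t]]
            back[t][s] = prev

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]
```

The cost of this sketch is proportional to the number of frames times the square of the number of states, which is the quantity that pruning or optimized implementations typically try to reduce.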
Linear feature space transformations are often used for speaker or environment adaptation. Usually, numerical methods are sought to obtain solutions. In this paper, we derive a closed-form solution to ML estimation of full feature transformations. Closed-form solutions are desirable because the problem is quadratic and thus blind numerical analysis may...
In this paper we address the problem of speaker adaptation in noisy environments. We estimate speaker adapted models from noisy data by combining unsupervised speaker adaptation with noise compensation. We aim at using the resulting speaker adapted models in environments that differ from the adaptation environment, without a significant loss in performance.