MEAN TEACHER WITH DATA AUGMENTATION FOR DCASE 2019 TASK 4 Technical Report
TLDR
A mean-teacher model with a convolutional neural network (CNN) and a recurrent neural network (RNN), together with data augmentation and a median window tuned for each class based on prior knowledge, is proposed.
Frame-synchronous stochastic matching based on the Kullback-Leibler information
TLDR
This work models speech with hidden Markov models (HMMs) in the cepstral domain, reduces the mismatch with a parametric function, and presents a frame-synchronous estimation of that function's parameters.
Online SLU model adaptation with a partial oracle
TLDR
A supervised approach for updating the SLU models of a deployed SDS that needs no additional manual transcription or annotation; the supervision is given by the users calling the SDS.
Gaussian density tree structure in a multi-Gaussian HMM-based speech recognition system
This paper presents the use of a Gaussian density tree structure, which enables a computational cost reduction without a significant degradation of recognition performance, during a continuous speech
Frame-synchronous adaptation of cepstrum by linear regression
TLDR
Recognition experiments carried out on both PSTN and GSM networks show the efficiency of the proposed method: with a model trained on PSTN recorded digits, the error rate can be reduced with bias subtraction and by 36% with linear regression.
Robust speech recognition techniques evaluation for telephony server based in-car applications
TLDR
The feasibility of designing a speech-recognition-based telephony server for in-car applications with an acceptable recognition rate is investigated, and the gain of using either a robust sound recording device or a noise-robust front-end is demonstrated.
Exploiting semantic relations for a spoken language understanding application
TLDR
This article proposes a new confidence measure for concept hypotheses provided by a semantic language model used in a dialog application; the measure is based upon the ontology and, more precisely, upon the semantic relations between concepts.
Signal bias removal using the multi-path stochastic equalization technique
TLDR
This work applies the MUlti-path Stochastic Equalization framework to perform bias removal in the cepstral domain in order to increase the robustness of automatic speech recognizers.
About improving recognition of spontaneously uttered French city-names
TLDR
This paper deals with the recognition of French city-names over the telephone, which involves a 40,000 city-name vocabulary, ranging from short monosyllabic words to long official compound-names, and several ways of improving speech recognition performance are investigated.