Francesco Nesta

Distant-microphone automatic speech recognition (ASR) remains a challenging goal in everyday environments involving multiple background sources and reverberation. This paper is intended to be a reference on the 2nd ‘CHiME’ Challenge, an initiative designed to analyze and evaluate the performance of ASR systems in a real-world domestic environment. Two…
This paper proposes a new method of frequency-domain blind source separation (FD-BSS), able to separate acoustic sources in challenging conditions. In frequency-domain BSS, the time-domain signals are transformed into time-frequency series and the separation is generally performed by applying independent component analysis (ICA) at each frequency envelope…
Distant-microphone automatic speech recognition (ASR) remains a challenging goal in everyday environments involving multiple background sources and reverberation. This paper reports on the results of the 2nd ‘CHiME’ Challenge, an initiative designed to analyse and evaluate the performance of ASR systems in a real-world domestic environment. We discuss the…
This paper summarizes the audio part of the 2011 community-based Signal Separation Evaluation Campaign (SiSEC2011). Four speech and music datasets were contributed, including datasets recorded in noisy or dynamic environments and a subset of the SiSEC2010 datasets. The participants addressed one or more tasks out of four source separation tasks, and the…
Blind source extraction (BSE) is an attractive approach to enhance multichannel noisy speech data, as a preprocessing step for an automatic speech recognition system. BSE was successfully applied to the first CHiME Pascal Challenge for improving the recognition rate of noisy commands in a small dictionary task. In this work we reviewed the BSE architecture…
According to the physical meaning of frequency-domain blind source separation (FD-BSS), each mixing matrix estimated by independent component analysis (ICA) contains information on the physical acoustic propagation related to each source and can therefore be used for localization purposes. In this paper, we analyze the Generalized State Coherence Transform…
This paper describes the system used to process the data of the CHiME Pascal 2011 competition, whose goal is to separate the desired speech and recognize the commands being spoken. The binaurally recorded mixtures are processed by an on-line Semi-Blind Source Extraction algorithm. The algorithm is based on a multi-stage architecture combining the advantages of…
A novel method is presented for robust two-channel estimation of multiple Time Differences of Arrival (TDOA) for multispeaker localization, which can provide satisfactory performance even in highly reverberant environments. The method is based on a recursive frequency-domain Independent Component Analysis (ICA) and on a novel State Coherence Transform (SCT)…
A novel method to solve the permutation problem for Blind Source Separation (BSS) is presented. According to the acoustic propagation model, in the frequency domain each separation matrix can be represented with a set of states associated with each source. We formulate a novel transform of the states which is independent of the aliasing and of the permutations…