Yue Xian Zou

By exploring the time-frequency (TF) sparsity of speech, the inter-sensor data ratios (ISDRs) of a single acoustic vector sensor (AVS) have been derived and investigated. Under noiseless conditions, ISDRs have favorable properties, such as being independent of frequency, being related to the DOA in a single-valued manner, and imposing no constraints on near or far field …
The performance of DOA estimation with scalar sensor arrays using the spatial sparse signal reconstruction (SSR) technique is affected by the grid spacing. In this paper, we formulate DOA estimation with acoustic vector sensor (AVS) arrays under the SSR framework. A coarse-to-fine DOA estimation algorithm has been developed. The source spatial sparsity and …
This paper proposes a novel approach to detecting multiple, simultaneous talkers in multi-party meetings using localisation of active speech sources recorded with an ad-hoc microphone array. Cues indicating the relative distance between sources and microphones are derived from speech signals and room impulse responses recorded by each of the microphones …
Accurate DOA estimation based on clustering the inter-sensor data ratios (ISDRs) of a single acoustic vector sensor (AVS), referred to as AVS-ISDR, relies on reliable extraction of time-frequency points with high local signal-to-noise ratio (HLSNR-TFPs), and its performance degrades in noisy environments. This paper investigates deep neural networks (DNNs) …
This paper investigates the formation of ad-hoc microphone arrays for the purpose of recording multiple sound sources by clustering microphones spatially distributed within a room. A novel codebook-based unsupervised method for cluster formation using features derived from the Room Impulse Responses (RIRs) corresponding to each microphone is proposed and …
This paper investigates speaker direction of arrival (DOA) estimation using a single acoustic vector sensor (AVS). With the definition of the inter-sensor data ratio (ISDR) in the time-frequency (TF) domain and the use of the high local signal-to-noise ratio (HLSNR) TF points, an effective ISDR data model is derived, which determines the relationship …