Feature extraction and selection for continuous speech recognition is a complex task. State-of-the-art speech recognition systems use features that are derived by ignoring the Fourier transform phase. In our earlier studies we have shown the efficacy of the modified group delay feature (MODGDF) derived from the Fourier transform phase for phoneme, syllable(More)
Spectral representation of speech is complete when both the Fourier transform magnitude and phase spectra are specified. In conventional speech recognition systems, features are generally derived from the short-time magnitude spectrum. Although the importance of Fourier transform phase in speech perception has been realized, few attempts have been made to(More)
In the development of a syllable-centric ASR system, segmentation of the acoustic signal into syllabic units is an important stage. This paper presents a minimum phase group delay based approach to segment spontaneous speech into syllable-like units. Here, three different minimum phase signals are derived from the short term energy functions of three(More)
Subspace-based source localization methods utilize the spectral magnitude of the MUltiple SIgnal Classification (MUSIC) method. However, in all these methods, a large number of sensors are required to resolve closely spaced sources. A novel method for high resolution source localization based on the group delay of MUSIC is described in this work. The method(More)
Conventionally, the spectral magnitude of MUSIC is used for efficient beamforming and clean speech acquisition from distant microphones. The MUSIC method is unable to resolve closely spaced DOAs with a computationally plausible number of sensors. In this paper we propose the use of the group delay function computed from the MUSIC phase spectrum for efficient(More)
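As a point of reference for the two MUSIC abstracts above, the following is a minimal sketch of the conventional MUSIC spectral magnitude only; it does not implement the group delay variant the papers propose, whose details are not given here. It assumes a uniform linear array with half-wavelength sensor spacing and a known number of sources, and the function names `steering` and `music_spectrum` are hypothetical.

```python
import numpy as np

def steering(theta_deg, m_sensors):
    """Steering vectors of a uniform linear array (assumed half-wavelength spacing).

    Returns an array of shape (m_sensors, number_of_angles).
    """
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    m = np.arange(m_sensors)[:, None]
    return np.exp(-1j * np.pi * m * np.sin(theta)[None, :])

def music_spectrum(R, n_sources, grid_deg):
    """Conventional MUSIC pseudo-spectrum (magnitude) on a grid of angles.

    R is the sensor covariance matrix; n_sources is assumed known.
    """
    m_sensors = R.shape[0]
    # eigh returns eigenvalues in ascending order, so the first
    # (m_sensors - n_sources) eigenvectors span the noise subspace.
    _, vecs = np.linalg.eigh(R)
    noise_subspace = vecs[:, : m_sensors - n_sources]
    A = steering(grid_deg, m_sensors)
    # Squared norm of each steering vector projected onto the noise subspace.
    denom = np.sum(np.abs(noise_subspace.conj().T @ A) ** 2, axis=0)
    # Small constant guards against division by zero at exact source angles.
    return 1.0 / (denom + 1e-12)
```

Peaks of the returned spectrum indicate candidate directions of arrival. With few sensors, the peaks of two closely spaced sources tend to merge, which is the limitation both abstracts describe.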
Automatic identification of bird calls without manual intervention has been a challenging task for meaningful research on the taxonomy and monitoring of bird migrations in ornithology. In this paper we apply several techniques used in speech recognition to the automatic identification of bird calls. A new technique which computes the ensemble average on the(More)
Recommended by Climent Nadeu. This paper investigates the significance of combining cepstral features derived from the modified group delay function and from the short-time spectral magnitude, such as the MFCC. The conventional group delay function fails to capture the resonant structure and the dynamic range of the speech spectrum primarily due to pitch(More)
This paper discusses the significance of joint cepstral features derived from the modified group delay function and MFCC in speech processing. We start with a definition of cepstral features derived from the modified group delay function called the modified group delay feature (MODGDF) which is derived from the Fourier transform phase. Robustness issues(More)
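The conventional group delay function mentioned in the abstracts above is the negative derivative of the Fourier transform phase. A minimal sketch follows, using the standard identity that avoids phase unwrapping: if X is the DFT of x[n] and Y is the DFT of n·x[n], the group delay is (X_R·Y_R + X_I·Y_I) / |X|². This is the unmodified function, not the MODGDF itself; the function name `group_delay` and the small floor on the denominator are assumptions made for illustration.

```python
import numpy as np

def group_delay(x, n_fft=None):
    """Conventional group delay of a finite sequence, in samples.

    Computed without phase unwrapping from X = DFT(x) and Y = DFT(n * x).
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    if n_fft is None:
        n_fft = len(x)
    X = np.fft.rfft(x, n_fft)
    Y = np.fft.rfft(n * x, n_fft)
    # Floor on |X|^2 (an assumption) avoids division by zero at spectral nulls.
    denom = np.maximum(np.abs(X) ** 2, 1e-12)
    return (X.real * Y.real + X.imag * Y.imag) / denom
```

For a unit impulse delayed by three samples, this returns a group delay of three samples at every frequency bin. The division by |X|² is what makes the function spiky where the magnitude spectrum is close to zero.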
In this paper we describe the design concepts and prototype implementation of a situation-aware ubiquitous computing system using multiple modalities such as National Marine Electronics Association (NMEA) data from global positioning system (GPS) receivers, text, speech, environmental audio, and handwriting inputs. While most mobile and communication(More)