Learn More
This paper presents an algorithm for automatically estimating speaker height. It is based on: (1) a recently-proposed model of the subglottal system that explains the inverse relation observed between subglottal resonances and height, and (2) an improved version of our previous algorithm for automatically estimating the second subglottal resonance (Sg2).(More)
Recent research has demonstrated the usefulness of subglottal resonances (SGRs) in speaker normalization. However, existing algorithms for estimating SGRs from speech signals have limited applicability—they are effective with isolated vowels only. This paper proposes a novel algorithm for estimating the first three SGRs (Sg1; Sg2 and Sg3) from continuous(More)
This letter focuses on the automatic estimation of the first subglottal resonance (Sg1). A database comprising speech and subglottal data of native American English speakers and bilingual Spanish/English speakers was used for the analysis. Data from 11 speakers (five males and six females) were used to derive an empirical relation among the first formant(More)
The growth of Massive Open Online Courses (MOOCs) has been remarkable in the last few years. A significant amount of MOOCs content is in the form of videos and participants often use non-linear navigation to browse through a video. This paper proposes the design of a system that provides non-linear navigation in educational videos using features derived(More)
This paper presents a large-scale study of subglottal resonances (SGRs) (the resonant frequencies of the tracheo-bronchial tree) and their relations to various acoustical and physiological characteristics of speakers. The paper presents data from a corpus of simultaneous microphone and accelerometer recordings of consonant-vowel-consonant (CVC) words(More)
Current speech-input systems typically use a nonspeech threshold for end-of-utterance detection. While usually sufficient for short utterances, the approach can cut speakers off during pauses in more complex utterances. We elicit personal-assistant speech (reminders, calendar entries, messaging, search) using a recognizer with a dramatically increased(More)
Previous studies of subglottal resonances have reported findings based on relatively few subjects, and the relations between these resonances, subglottal anatomy, and models of subglottal acoustics are not well understood. In this study, accelerometer signals of subglottal acoustics recorded during sustained [a:] vowels of 50 adult native speakers (25(More)
This paper deals with the automatic estimation of the second subglottal resonance (Sg2) from natural speech spoken by adults, since our previous work focused only on estimating Sg2 from isolated diphthongs. A new database comprising speech and subglottal data of native American English (AE) speakers and bilingual Spanish/English speakers was used for the(More)
This letter investigates the use of MFCCs and GMMs for 1) improving the state of the art in speaker height estimation, and 2) rapid estimation of subglottal resonances (SGRs) without relying on formant and pitch tracking (unlike our previous algorithm in [1]). The proposed system comprises a set of height-dependent GMMs modeling static and dynamic MFCC(More)
This paper proposes a non-linear frequency warping scheme for VTLN. It is based on mapping the subglottal resonances (SGRs) and the third formant frequency (F3) of a given utterance to those of a reference speaker. SGRs are used because they relate to formants in specific ways while remaining phonetically invariant, and F3 is used because it is somewhat(More)