Harish Arsikere

Learn More
This paper presents an algorithm for automatically estimating speaker height. It is based on: (1) a recently-proposed model of the subglottal system that explains the inverse relation observed between subglottal resonances and height, and (2) an improved version of our previous algorithm for automatically estimating the second subglottal resonance (Sg2).(More)
This paper deals with the automatic estimation of the second subglot-tal resonance (Sg2) from natural speech spoken by adults, since our previous work focused only on estimating Sg2 from isolated diphthongs. A new database comprising speech and subglottal data of native American English (AE) speakers and bilingual Spanish/English speakers was used for the(More)
Recent research has demonstrated the usefulness of subglottal resonances (SGRs) in speaker normalization. However, existing algorithms for estimating SGRs from speech signals have limited applicability—they are effective with isolated vowels only. This paper proposes a novel algorithm for estimating the first three SGRs (Sg1; Sg2 and Sg3) from continuous(More)
Previous studies of subglottal resonances have reported findings based on relatively few subjects, and the relations between these resonances, subglottal anatomy, and models of subglottal acoustics are not well understood. In this study, accelerometer signals of subglottal acoustics recorded during sustained [a:] vowels of 50 adult native speakers (25(More)
This paper presents a large-scale study of subglottal resonances (SGRs) (the resonant frequencies of the tracheo-bronchial tree) and their relations to various acoustical and physiological characteristics of speakers. The paper presents data from a corpus of simultaneous microphone and accelerometer recordings of consonant-vowel-consonant (CVC) words(More)
The growth of Massive Open Online Courses (MOOCs) has been remarkable in the last few years. A significant amount of MOOCs content is in the form of videos and participants often use non-linear navigation to browse through a video. This paper proposes the design of a system that provides non-linear navigation in educational videos using features derived(More)
This paper proposes a non-linear frequency warping scheme for VTLN. It is based on mapping the subglottal resonances (SGRs) and the third formant frequency (F 3) of a given utterance to those of a reference speaker. SGRs are used because they relate to for-mants in specific ways while remaining phonetically invariant, and F 3 is used because it is somewhat(More)
Models and measurements of subglottal resonances are generally made from adult data, but there are several applications in which it would be useful to know about subglottal resonances in children. We therefore conducted an analysis of both new and old recordings of children's subglottal acoustics in order 1) to produce a fuller picture of the variability of(More)
This letter focuses on the automatic estimation of the first subglottal resonance (Sg1). A database comprising speech and subglottal data of native American English speakers and bilingual Spanish/English speakers was used for the analysis. Data from 11 speakers (five males and six females) were used to derive an empirical relation among the first formant(More)