Charles C. Broun

Speech not only conveys the linguistic information, but also characterizes the talker's identity and therefore can be used in personal authentication. While most of the speech information is contained in the acoustic channel, the lip movement during speech production also provides useful information. In this paper we investigate the effectiveness of visual…
There has been growing interest in introducing speech as a new modality into the human-computer interface (HCI). Motivated by the multimodal nature of speech, the visual component is considered to yield information that is not always present in the acoustic signal and enables improved system performance over acoustic-only methods, especially in noisy…
Mainstream automatic speech recognition has focused almost exclusively on the acoustic signal. The performance of these systems degrades considerably in the real world in the presence of noise. On the other hand, most human listeners, both hearing-impaired and normal hearing, make use of visual information to improve speech perception in acoustically…
Speech recognition is continually being realized as a user interface in new applications. As this technology progresses, it enables new ways for humans to interact with machines and information. The performance in many domains has approached users’ expectations. Although there are still abundant technology challenges ahead, speech recognition has reached a…
With the advent of Wireless Application Protocol (WAP) and 2.5/3G communication systems, the mobile device has become a window to the Internet. A natural interface to this mobile device is through speech. To address this need, a new European Telecommunications Standards Institute (ETSI) standard front-end has evolved for Distributed Speech Recognition…
Biometrics is gaining strong support for the personalization of and the securing of mobile devices. It is not uncommon for individual users to be faced with a half-dozen or more passwords and personal identification numbers. The ubiquity of passwords actually relaxes system security since many users tend to use the same password across all applications, or…
Computationally scalable speaker recognition systems are highly desirable in practice. To achieve this objective, we use a two-stage architecture for text-prompted speaker recognition. In this system, the input speech is first segmented on subword boundaries using a Viterbi alignment. The second stage applies a polynomial classifier to each subword for…
Biometrics is gaining strong support for access control in the industry. It is not uncommon for individual users to be faced with a half-dozen or more passwords and personal identification numbers (PINs) controlling access to the systems required for them to do their job. The ubiquity of passwords actually relaxes system security since many users tend to…
A novel system for text-prompted speaker recognition is presented. The system first segments the speech by Viterbi alignment with speaker independent models. It then applies a polynomial classifier to each subword for recognition. This methodology has several interesting aspects. First, the system has excellent computational scalability for identification…