Panayiotis G. Georgiou

Long speech-text alignment can facilitate large-scale study of rich spoken language resources that have recently become widely accessible, e.g., collections of audio books or multimedia documents. For such resources, conventional Viterbi-based forced alignment often proves inadequate, mainly due to mismatched audio and text and/or noisy audio. In …
ASTERIA I was a 40-week, randomized, double-blind, placebo-controlled study to evaluate the efficacy and safety of subcutaneous omalizumab as add-on therapy for 24 weeks in patients with chronic idiopathic urticaria/spontaneous urticaria (CIU/CSU) who remained symptomatic despite H1 antihistamine treatment at licensed doses. Patients aged 12-75 years with …
Interaction synchrony among interlocutors happens naturally as people gradually adapt their speaking style to promote efficient communication. In this work, we quantify one aspect of interaction synchrony, prosodic entrainment (specifically in pitch and energy), in married couples’ problem-solving interactions using speech signal-derived measures. Statistical …
In this paper we describe the first phase of development of our speech-to-speech system between English and Modern Persian under the DARPA Babylon program. We give an overview of the various system components: the front-end ASR, the machine translation system, and the speech generation system. Challenges such as the sparseness of available spoken language …
The expression and experience of human behavior are complex and multimodal and characterized by individual and contextual heterogeneity and variability. Speech and spoken language communication cues offer an important means for measuring and modeling human behavior. Observational research and practice across a variety of domains from commerce to …
A new representation of audio noise signals is proposed, based on symmetric α-stable (SαS) distributions, in order to better model the outliers that exist in real signals. This representation addresses a shortcoming of the Gaussian model, namely, the fact that it is not well suited for describing signals with impulsive behavior. The α-stable and Gaussian …
The goal of this work is to build a real-time emotion detection system that utilizes multi-modal fusion of different timescale features of speech. Conventional spectral and prosody features are used for intra-frame and supra-frame features, respectively, and a new information fusion algorithm which takes care of the characteristics of each machine learning …
The ability to build topic-specific language models rapidly and with minimal human effort is a critical need for fast deployment and portability of ASR across different domains. The World Wide Web (WWW) promises to be an excellent textual data resource for creating topic-specific language models. In this paper we describe an iterative web crawling …
Our long-term objective is to create Smart Room Technologies that are aware of the users' presence and behavior and can become an active, but not intrusive, part of the interaction. In this work, we present a multimodal approach for estimating and tracking the location and identity of the participants, including the active speaker. Our smart room …