Fuliang Weng

Social media language contains a huge amount and wide variety of nonstandard tokens, created both intentionally and unintentionally by users. It is crucially important to normalize these noisy nonstandard tokens before applying other NLP techniques. A major challenge facing this task is system coverage, i.e., for any user-created nonstandard term, the ...
Most text message normalization approaches are based on supervised learning and rely on human-labeled training data. In addition, nonstandard words are often categorized into different types, and specific models are designed to tackle each type. In this paper, we propose a unified letter transformation approach that requires neither pre-categorization ...
Spoken dialogue interfaces, mostly command-and-control, are becoming more visible in applications where attention needs to be shared with other tasks, such as driving a car. The deployment of simple dialog systems, instead of more sophisticated ones, is partly because the computing platforms used for such tasks have been less powerful and partly because ...
Joint compression and summarization has been used recently to generate high-quality summaries. However, such word-based joint optimization is computationally expensive. In this paper we adopt the ‘sentence compression + sentence selection’ pipeline approach for compressive summarization, but propose to perform summary-guided compression, rather than generic ...
The potential benefit of integrating contextual information for recommendation has received much research attention recently, especially with the ever-increasing interest in mobile-based recommendation services. However, context-based recommendation research is limited due to the lack of standard evaluation data with contextual information and reliable ...
We explore the relationship between question answering and constraint relaxation in spoken dialogue systems, and develop dialogue strategies for selecting and presenting information succinctly. In particular, we describe methods for dealing with the results of database queries in information-seeking dialogues. Our goal is to structure the dialogue in such a ...
This paper describes a data collection process aimed at gathering human-computer dialogs in high-stress or “busy” domains where the user is concentrating on tasks other than the conversation, for example, when driving a car. Designing spoken dialog interfaces for such domains is extremely challenging and the data collected will help us improve the dialog ...
Variations in rate of speech (ROS) produce changes in both spectral features and word pronunciations that affect ASR systems. To cope with these effects, we propose to use rate-specific phone models and pronunciations for ROS modeling at the word level. Words are given three types of pronunciations (fast, slow, and medium) consisting of rate-specific phone ...
We propose to use user simulation for testing during the development of a sophisticated dialog system. While the limited behaviors of state-of-the-art user simulation may not cover important aspects of dialog system testing, our proposed approach extends the functionality of the simulation so that it can be used at least for early-stage testing ...