Bassam Jabaian

In this paper, several approaches for language portability of dialogue systems are investigated with a focus on the spoken language understanding (SLU) component. We show that the use of statistical machine translation (SMT) can greatly reduce the time and cost of porting an existing system from a source to a target language. Using automatically translated ...
Machine learning algorithms are now common in state-of-the-art spoken language understanding models. But to reach good performance, they must be trained on a potentially large amount of data, which is not available for a variety of tasks and languages of interest. In this work, we present a novel zero-shot learning method, based on word embeddings, ...
Portability of a spoken dialogue system (SDS) to a new domain or a new language is a hot topic, as it may imply gains in time and cost for building new SDSs. In particular, in this paper we investigate several fast and efficient approaches for language portability of the spoken language understanding (SLU) module of a dialogue system. We show that the use of ...
The challenge with language portability of a spoken language understanding module is to be able to reuse the knowledge and the data available in a source language to produce knowledge in the target language. In this paper, several approaches are proposed, motivated by the availability of the MEDIA French dialogue corpus and its manual translation into ...
Many recent competitive state-of-the-art solutions for understanding speech data share two traits: they are probabilistic, and they rely on machine learning algorithms to train their models from large amounts of data. The difficulty remains in the cost and time of collecting and annotating such data, but also in updating the existing models to new conditions, tasks ...
The PORTMEDIA project is intended to develop new corpora for the evaluation of spoken language understanding systems. The newly collected data are in the field of human-machine dialogue systems for tourist information in French in line with the MEDIA corpus. Transcriptions and semantic annotations, obtained by low-cost procedures, are provided to allow a ...
Probabilistic approaches are now widespread in most natural language processing applications, and the selection of a particular approach usually depends on the task at hand. Targeting speech semantic interpretation in a multilingual context, this paper presents a comparison between the state-of-the-art methods used for machine translation and speech ...
Following recent trends in the development of spoken dialogue systems, this paper proposes to improve the performance of the user’s intent extraction by means of joint decoding of automatic spoken language transcription and understanding. Gains are expected not only from a better connectivity and mutual awareness of both tasks but also through the use of ...
As data-driven approaches started to make their way into the Natural Language Generation (NLG) domain, the need for automation of corpus building and extension became apparent. Corpus creation and extension in the data-driven NLG domain traditionally involved manual paraphrasing, performed either by a group of experts or through crowd-sourcing. Building ...