Mohamed Morchid

This article presents two methods for the automatic detection of social events, evaluated on the annotated set of pictures from the MediaEval 2011 benchmark [1]. The first method uses a set of web pages and a semantic space obtained by Latent Dirichlet Allocation (LDA, [2, 3]) to classify pictures from Flickr. The second approach uses the…
Although current transcription systems can achieve high recognition performance, they still struggle to transcribe speech in very noisy environments. The transcription quality has a direct impact on classification tasks using text features. In this paper, we propose to identify themes of telephone conversation services with the…
In this paper, we study the impact of dialogue representations and classification methods on the task of theme identification of telephone conversation services with highly imperfect automatic transcriptions. Two dialogue representations are first compared: the classical Term Frequency-Inverse Document Frequency with Gini purity criteria (TF-IDF-Gini)…
The paper introduces new features for describing possible focus variation in a human/human conversation. The application considered is a real-life telephone customer care service. The purpose is to hypothesize the dominant theme of conversations between a casual customer calling. Conversations are processed by an automatic speech recognition system that…
In this paper, we present a method of tweet contextualization that uses a semantic space to extend the tweet vocabulary. This method is evaluated on the tweet contextualization benchmark. The contextualization is built with sentences from English Wikipedia. The context is obtained by querying a baseline summarization system. The query is made with words from a…
In this paper, we describe the LIA system proposed for the MediaEval 2013 Spoken Web Search task. This multi-language task involves searching for an audio content query, in a database, with no training resources available. The participants must then find locations of each given query term within a large database of untranscribed audio files. For this task,…
We present a method to detect social events in a set of pictures from an image hosting service (Flickr). This method relies on the analysis of user-generated tags, by using statistical models trained on both a small set of manually annotated data and a large data set collected from the Internet. Social event modeling relies on multi-span topic model based…
Various studies highlighted that topic-based approaches give a powerful spoken content representation of documents. Nonetheless, these documents may contain more than one main theme, and their automatic transcription inevitably contains errors. In this study, we propose an original and promising framework based on a compact representation of a textual…