Marie-Francine Moens

A topic model outputs a multinomial distribution over words for each topic. In this paper, we investigate the value of bilingual topic models, i.e., a bilingual Latent Dirichlet Allocation model, for finding translations of terms in comparable corpora without using any linguistic resources. Experiments on a document-aligned English-Italian Wikipedia ...
Sentiment analysis, also called opinion mining, is a form of information extraction from text of growing research and commercial interest. In this paper we present our machine learning experiments with regard to sentiment analysis in blog, review and forum texts found on the World Wide Web and written in English, Dutch and French. We train from a set of ...
In this paper, we extend current state-of-the-art research on unsupervised acquisition of scripts, that is, stereotypical and frequently observed sequences of events. We design, evaluate and compare different methods for constructing models for script event prediction: given a partial chain of events in a script, predict other events that are likely to ...
We propose a new unified framework for monolingual (MoIR) and cross-lingual information retrieval (CLIR) which relies on the induction of dense real-valued word vectors known as word embeddings (WE) from comparable data. To this end, we make several important contributions: (1) We present a novel word representation learning model called Bilingual Word ...
Phishing emails usually contain a message from a credible-looking source asking the user to click a link to a website where she/he is asked to enter a password or other confidential information. Most phishing emails aim at withdrawing money from financial institutions or getting access to private information. Phishing has increased enormously over the ...
The growing stream of content placed on the Web provides a huge collection of textual resources. People share their experiences online, vent their opinions (and frustrations), or simply talk about just about anything. The large amount of available data creates opportunities for automatic mining and analysis. The information of interest in this paper, ...
Argumentation is the process by which arguments are constructed and handled. Argumentation constitutes a major component of human intelligence. The ability to engage in argumentation is essential for humans to understand new problems, to perform scientific reasoning, to express, to clarify and to defend their opinions in their daily lives. Argumentation ...
This article reports on the novel task of spatial role labeling in natural language text. It proposes machine learning methods to extract spatial roles and their relations. This work experiments with both a step-wise approach, where spatial prepositions are found and the related trajectors and landmarks are then extracted, and a joint learning ...
This paper provides the results of experiments on the detection of arguments in texts, including legal texts. The detection is seen as a classification problem. A classifier is trained on a set of annotated arguments. Different feature sets are evaluated involving lexical, syntactic, semantic and discourse properties of the texts. The experiments are a ...