Juan-Manuel Torres-Moreno

Learn More
We present SMMR, a scalable sentence scoring method for query-oriented update summarization. Sentences are scored thanks to a criterion combining query relevance and dissimilarity with already read documents (history). As the amount of data in history increases, non-redundancy is prioritized over query-relevance. We show that SMMR achieves promising results(More)
This paper presents two corpora produced within the RPM2 project: a multi-document summarization corpus and a sentence compression corpus. Both corpora are in French. The first one is the only one we know in this language. It contains 20 topics with 20 documents each. A first set of 10 documents per topic is summarized and then the second set is used to(More)
Resumen: Hoy en día el análisis discursivo automático es un tema de investigación relevante. Sin embargo, no existen analizadores del discurso para textos en español. El primer paso para desarrollar esta herramienta es la segmentación discursiva. En este artículo presentamos DiSeg, el primer segmentador discursivo para el español que utiliza el marco de la(More)
We study correlation of rankings of text summarization systems using evaluation methods with and without human models. We apply our comparison framework to various well-established content-based evaluation measures in text sum-marization such as coverage, Responsiveness , Pyramids and ROUGE studying their associations in various text summarization tasks(More)
Availability of labeled language resources, such as annotated corpora and domain dependent labeled language resources is crucial for experiments in the field of Natural Language Processing. Most often, due to lack of resources, manual verification and annotation of electronic text material is a prerequisite for the development of NLP tools. In the context(More)
—We study a new content-based method for the evaluation of text summarization systems without human models which is used to produce system rankings. The research is carried out using a new content-based evaluation framework called FRESA to compute a variety of divergences among probability distributions. We apply our comparison framework to various(More)
Since information in electronic form is already a standard, and that the variety and the quantity of information become increasingly large, the methods of summarizing or automatic condensation of texts is a critical phase of the analysis of texts. This article describes Cortex a system based on numerical methods, which allows obtaining a condensation of a(More)
In this paper we present a Neural Network approach, inspired by statistical physics of magnetic systems, to study fundamental problems of Natural Language Processing (NLP). The algorithm models documents as neural network whose Textual Energy is studied. We obtained good results on the application of this method to automatic summarization and Topic(More)
Compilation of Languages for Specific Purposes (LSP) corpora is a task which is fraught with several difficulties (mainly time and human effort), because it is not easy to discern between specialized and non-specialized text. The aim of this work is to study automatic specialized vs. non-specialized sentence differentiation. The experiments are carried out(More)