Manuel Montes-y-Gómez

Automatic image annotation (AIA), a highly popular topic in information retrieval research, has progressed significantly within the last decade. Yet the lack of a standardized evaluation platform tailored to the needs of AIA has hindered effective evaluation of its methods, especially for region-based AIA. Therefore, in this paper, we …
In this paper we present a new Question Answering (QA) system based on redundancy and a new Passage Retrieval (PR) method oriented to QA. We suppose that in a large enough document collection we can find the answer to any question of several forms. Therefore, it is possible to find one or several sentences including the answers which contain part of the …
This paper proposes the use of local histograms (LH) over character n-grams for authorship attribution (AA). LHs are enriched histogram representations that preserve sequential information in documents; they have been successfully used for text categorization and document visualization using word histograms. In this work we explore the suitability of LHs …
The use of conceptual graphs for the representation of text contents in information retrieval is discussed. A method for measuring the similarity between two texts represented as conceptual graphs is presented. The method is based on well-known strategies of text comparison, such as the Dice coefficient, with new elements introduced due to the bipartite nature …
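For reference, the plain set-based Dice coefficient named in this abstract can be sketched as follows. This is only the generic textbook form over token sets, not the conceptual-graph variant the paper proposes; the function name `dice_coefficient` and the two sample sentences are illustrative assumptions.

```python
def dice_coefficient(a, b):
    """Set-based Dice coefficient: 2 * |A & B| / (|A| + |B|)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        # Convention chosen for this sketch: two empty sets count as identical.
        return 1.0
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


# Made-up example texts, tokenized by whitespace.
text_1 = "the cat sat on the mat".split()
text_2 = "the cat lay on the rug".split()

# Each text has 5 distinct tokens and they share 3 ("the", "cat", "on").
score = dice_coefficient(text_1, text_2)
print(score)
```

The score ranges from 0 (no shared elements) to 1 (identical sets).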
This paper proposes the application of particle swarm optimization (PSO) to the problem of full model selection (FMS) for classification tasks. FMS is defined as follows: given a pool of preprocessing methods, feature selection and learning algorithms, select the combination of these that obtains the lowest classification error for a given data set; the …
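The FMS definition in this abstract can be illustrated with a minimal sketch. It uses exhaustive search over a tiny pool rather than PSO, which the paper applies because realistic pools are too large to enumerate. The toy data, the pool contents, the leave-one-out error estimate, and all function names are assumptions made for illustration.

```python
from itertools import product

# Made-up toy data: feature 0 is noise, feature 1 separates the classes.
X = [(5.0, 0.1), (1.0, 0.2), (4.0, 0.3), (2.0, 0.9), (5.0, 0.8), (1.0, 1.0)]
y = [0, 0, 0, 1, 1, 1]

def identity(rows):
    return rows

def min_max(rows):
    # Rescale every feature to [0, 1].
    cols = list(zip(*rows))
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    return [tuple((v - l) / (h - l) if h > l else 0.0
                  for v, l, h in zip(r, lo, hi)) for r in rows]

def select(indices):
    # Feature selection: keep only the listed feature indices.
    return lambda rows: [tuple(r[i] for i in indices) for r in rows]

def majority(train_X, train_y):
    label = max(set(train_y), key=train_y.count)
    return lambda row: label

def nearest_centroid(train_X, train_y):
    centroids = {}
    for label in set(train_y):
        members = [r for r, t in zip(train_X, train_y) if t == label]
        centroids[label] = [sum(c) / len(members) for c in zip(*members)]
    return lambda row: min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(row, centroids[lab])))

def loo_error(pre, sel, learner):
    # Leave-one-out error. For brevity, preprocessing sees all rows,
    # which a careful evaluation would avoid.
    rows = sel(pre(X))
    wrong = 0
    for i in range(len(rows)):
        model = learner(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        wrong += model(rows[i]) != y[i]
    return wrong / len(rows)

preprocessors = {"identity": identity, "min-max": min_max}
selectors = {"feature 0": select([0]), "feature 1": select([1]),
             "both": select([0, 1])}
learners = {"majority": majority, "nearest centroid": nearest_centroid}

# A "full model" is one choice from each pool; score every combination.
results = {
    (p, s, l): loo_error(preprocessors[p], selectors[s], learners[l])
    for p, s, l in product(preprocessors, selectors, learners)
}
best = min(results, key=results.get)
best_error = results[best]
print(best, best_error)
```

Even this small pool yields 2 × 3 × 2 = 12 candidate models; the count grows multiplicatively with each pool and with hyperparameters, which is the motivation for a search heuristic.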
Character n-grams have been identified as the most successful feature in both single-domain and cross-domain Authorship Attribution (AA), but the reasons for their discriminative value were not fully understood. We identify subgroups of character n-grams that correspond to linguistic aspects commonly claimed to be covered by these features: morphosyntax, …
In this paper we describe the participation of the Laboratory of Language Technologies of INAOE at PAN 2014. We address the Author Profiling (AP) task by finding and exploiting relationships among terms, documents, profiles and subprofiles. Our approach uses the idea of second order attributes (a low-dimensional and dense document representation) [4], but goes …
The discovery of association rules is one of the classic problems of data mining. Typically, it is done over well-structured data, such as databases. In this paper, we present a method of discovery of association rules in semi-structured data, namely, in a set of conceptual graphs. The method is based on conceptual clustering of the data and constructing of …
This paper describes the participation of the Laboratory of Language Technologies of INAOE at the PAN 2013 evaluation lab. We adopted second order representations to address the problem of Author Profiling (AP). This representation tackles two shortcomings of the typical Bag-of-Terms: i) the sparsity and high dimensionality of document representations, and ii) …