Maria Camila Nardini Barioni

In this paper we describe a general framework for evaluation and optimization of methods for diversifying query results. In these methods, an initial ranking candidate set produced by a query is used to construct a result set, where elements are ranked with respect to relevance and diversity features, i.e., the retrieved elements should be as relevant as…
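To make the idea of ranking by both relevance and diversity concrete, here is a minimal sketch of one well-known greedy strategy, Maximal Marginal Relevance (MMR). It is an illustrative stand-in, not the framework described in the abstract above. The function name `mmr_rerank`, the `trade_off` parameter, and the toy scores are all assumptions made for this example.

```python
def mmr_rerank(candidates, relevance, similarity, k, trade_off=0.5):
    """Greedily pick k items, balancing relevance against redundancy (MMR).

    relevance:  dict mapping candidate -> relevance score (assumed given)
    similarity: function (a, b) -> similarity in [0, 1] (assumed given)
    trade_off:  1.0 ranks by relevance only; lower values favour diversity
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(c):
            # Redundancy is the similarity to the closest already-selected item.
            redundancy = max((similarity(c, s) for s in selected), default=0.0)
            return trade_off * relevance[c] - (1 - trade_off) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: "a" and "b" are near-duplicates, "c" is less relevant but different.
relevance = {"a": 0.9, "b": 0.85, "c": 0.5}
pair_similarity = {
    frozenset(("a", "b")): 0.95,
    frozenset(("a", "c")): 0.1,
    frozenset(("b", "c")): 0.1,
}
result = mmr_rerank(
    ["a", "b", "c"],
    relevance,
    lambda x, y: pair_similarity[frozenset((x, y))],
    k=2,
)
```

With these toy scores the second pick is "c" rather than the more relevant "b", because "b" is almost identical to the already selected "a".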
With the availability of very large databases, an exploratory query can easily lead to a vast answer set, typically ranked by each answer’s relevance (e.g., top-k, tf-idf) to the user query. Navigating through such an answer set requires huge effort, and users give up after perusing the first few answers, thus some interesting answers hidden further…
Similarity search has received considerable attention in modern database applications involving complex objects, since the queries executed in such databases are seldom based on exact matches but rather on some specific notion of similarity. However, the SQL query language does not provide effective support for similarity queries. This paper proposes the addition…
This paper presents a similarity retrieval engine, SIREN, that allows posing similarity queries in a relational DBMS using an extended syntax that adds support for this type of query to the SQL language. It discusses the main architecture of SIREN, describes some key features, and provides a description of the demo.
A similarity query takes an element as the query center and searches a dataset for either the elements within a bounding radius of the center or the k elements nearest to it. Several algorithms have been developed to execute similarity queries efficiently. However, there are queries that require more than one center, which we call Aggregate…
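The two single-center query types mentioned above can be sketched with a plain linear scan. This is only a baseline for illustration, assuming points in Euclidean space; the function names `range_query` and `knn_query` are hypothetical, and the efficient algorithms the abstract refers to rely on index structures instead of scanning every element.

```python
import heapq
import math


def range_query(dataset, center, radius, distance=math.dist):
    """Return every element whose distance to the center is at most radius."""
    return [x for x in dataset if distance(x, center) <= radius]


def knn_query(dataset, center, k, distance=math.dist):
    """Return the k elements closest to the center, nearest first."""
    return heapq.nsmallest(k, dataset, key=lambda x: distance(x, center))


points = [(0, 0), (1, 1), (2, 2), (5, 5)]
in_range = range_query(points, (0, 0), 1.5)  # elements within radius 1.5
nearest = knn_query(points, (0, 0), 2)       # the 2 nearest elements
```

Any function satisfying the metric axioms can be passed as `distance`, which is what lets the same two query types apply to complex objects such as images or time series.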
Scalable data mining algorithms have become crucial to efficiently support KDD processes on large databases. In this paper, we address the task of scaling up k-medoids based algorithms through the utilization of metric access methods, allowing clustering algorithms to be executed by database management systems in a fraction of the time usually required by…
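For readers unfamiliar with k-medoids, the following is a minimal sketch of the basic alternating scheme (assign each point to its nearest medoid, then re-pick each cluster's medoid). It deliberately does not use metric access methods, so it is the unoptimized baseline rather than the approach of the paper. The names `assign` and `k_medoids`, the naive initialisation, and the assumption that all points are distinct are choices made for this example.

```python
import math


def assign(points, medoids, distance):
    """Group each point under its nearest medoid."""
    clusters = {m: [] for m in medoids}
    for p in points:
        clusters[min(medoids, key=lambda m: distance(p, m))].append(p)
    return clusters


def k_medoids(points, k, distance=math.dist, max_iter=100):
    """Alternate between assignment and medoid update until stable."""
    medoids = list(points[:k])  # naive initialisation (assumption): first k points
    for _ in range(max_iter):
        clusters = assign(points, medoids, distance)
        # The new medoid of a cluster is the member with the smallest
        # total distance to all other members of that cluster.
        updated = [
            min(members, key=lambda c: sum(distance(c, q) for q in members))
            for members in clusters.values()
        ]
        if set(updated) == set(medoids):
            break
        medoids = updated
    return medoids, assign(points, medoids, distance)


data = [(0,), (1,), (2,), (10,), (11,), (12,)]
medoids, clusters = k_medoids(data, k=2)
```

Each update step compares every pair of points inside a cluster, which is the quadratic cost that makes scaling k-medoids to large databases a problem in the first place.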
Completely automated data analysis techniques often fail to meet their requirements, due to their inability to exploit peripheral knowledge associated with the data. Human beings are very good at interpreting data represented in graphical format, and usually have the wisdom to recognize the associated knowledge. This paper addresses this dichotomy through a…
Content-based image retrieval techniques rely on features automatically extracted from images to process similarity queries. Usually low-level features are extracted, and when they are used to compare images stored in a database to a reference image (through single center selection queries), they often lack the ability to convey to the users what they…