Humberto Luiz Razente

In this paper we describe a general framework for evaluation and optimization of methods for diversifying query results. In these methods, an initial ranking candidate set produced by a query is used to construct a result set, where elements are ranked with respect to relevance and diversity features, i.e., the retrieved elements should be as relevant as…
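As a rough illustration of ranking by both relevance and diversity, the sketch below uses a greedy selection in the style of maximal marginal relevance. It is a generic technique chosen for illustration, not the framework described in the abstract above; the function name `mmr_rerank`, the trade-off parameter `lam`, and the toy data are all assumptions.

```python
def mmr_rerank(candidates, relevance, distance, k, lam=0.5):
    """Greedily pick k items, trading relevance against diversity.

    lam = 1.0 ranks by relevance only; smaller values favour items that
    lie far from those already selected. All names here are hypothetical.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(c):
            # Diversity of c: distance to its closest already-selected item
            # (0.0 while nothing has been selected yet).
            diversity = min((distance(c, s) for s in selected), default=0.0)
            return lam * relevance[c] + (1 - lam) * diversity

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy candidate set: "a" and "b" are near-duplicates, "c" is far from both.
position = {"a": 0.0, "b": 0.1, "c": 1.0}
relevance = {"a": 0.9, "b": 0.85, "c": 0.5}


def distance(x, y):
    return abs(position[x] - position[y])


diversified = mmr_rerank(["a", "b", "c"], relevance, distance, k=2, lam=0.5)
relevance_only = mmr_rerank(["a", "b", "c"], relevance, distance, k=2, lam=1.0)
```

With `lam=0.5` the second pick is the distant element `"c"` rather than the near-duplicate `"b"`, which is the behaviour a diversification method aims for.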
With the availability of very large databases, an exploratory query can easily lead to a vast answer set, typically based on an answer’s relevance (i.e., top-k, tf-idf) to the user query. Navigating through such an answer set requires huge effort and users give up after perusing the first few answers, thus some interesting answers hidden further…
Similarity search has received considerable attention in modern database applications involving complex objects, since the queries executed in such databases are seldom based on exact matches but rather on some specific notion of similarity. However, the SQL query language does not provide effective support for similarity queries. This paper proposes the addition…
A similarity query considers an element as the query center and searches a dataset to find either the elements within a bounding radius of the query center or the k nearest ones to it. Several algorithms have been developed to efficiently execute similarity queries. However, there are queries that require more than one center, which we call Aggregate…
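The two single-center query types mentioned above can be sketched with a plain linear scan. This is a minimal illustration that assumes points in Euclidean space; the function names `range_query` and `knn_query` are hypothetical, and the efficient algorithms the abstract refers to would use an index rather than scanning every element.

```python
import heapq
import math


def range_query(dataset, center, radius, dist=math.dist):
    """Return every element whose distance to the center is at most radius."""
    return [x for x in dataset if dist(x, center) <= radius]


def knn_query(dataset, center, k, dist=math.dist):
    """Return the k elements closest to the center, nearest first."""
    return heapq.nsmallest(k, dataset, key=lambda x: dist(x, center))


# Toy 2-D dataset; distances to the origin are 0, 1, 2 and 5.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
in_range = range_query(points, (0.0, 0.0), 2.0)
nearest = knn_query(points, (0.0, 0.0), 2)
```

Any distance function can be passed as `dist`, which reflects the point that similarity queries depend on a chosen notion of similarity rather than on exact matches.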
This paper presents a new Picture Archiving and Communication System (PACS), called cbPACS, which has content-based image retrieval capabilities. The cbPACS answers range and k-nearest-neighbor similarity queries, employing a relational database manager extended to support images. The images are compared through their features, which are extracted by an…
Scalable data mining algorithms have become crucial to efficiently support KDD processes on large databases. In this paper, we address the task of scaling up k-medoids based algorithms through the utilization of metric access methods, allowing clustering algorithms to be executed by database management systems in a fraction of the time usually required by…
Completely automated data analysis techniques often fail to meet their requirements, due to their inability to exploit peripheral knowledge associated with the data. Human beings are very good at interpreting data represented in graphical format, and usually have the wisdom to recognize the associated knowledge. This paper addresses this dichotomy through a…