
In this paper we present the Slim-tree, a dynamic tree for organizing metric datasets in pages of fixed size. The Slim-tree uses the "fat-factor", which provides a simple way to quantify the degree of overlap between the nodes in a metric tree. It is well known that the degree of overlap directly affects the query performance of index structures. There are…

- Caetano Traina, Agma J. M. Traina, Christos Faloutsos, Bernhard Seeger
- IEEE Trans. Knowl. Data Eng.
- 2002

Abstract: Many recent database applications must deal with similarity queries. For such applications, it is important to measure the similarity between two objects using the distance between them. Focusing on this problem, this paper proposes the Slim-tree, a new dynamic tree for organizing metric data sets in pages of fixed size. The Slim-tree uses the…

Dimensionality curse and dimensionality reduction are two issues that have retained high interest for data mining, machine learning, multimedia indexing, and clustering. We present a fast, scalable algorithm to quickly select the most important attributes (dimensions) for a given set of n-dimensional vectors. In contrast to older methods, our method has the…

Designing a new access method inside a commercial DBMS is cumbersome and expensive. We propose a family of metric access methods that are fast and easy to implement on top of existing access methods, such as sequential scan, R-trees

- Marcos R. Vieira, Humberto Luiz Razente, +4 authors Vassilis J. Tsotras
- 2011 IEEE 27th International Conference on Data…
- 2011

In this paper we describe a general framework for evaluation and optimization of methods for diversifying query results. In these methods, an initial ranking candidate set produced by a query is used to construct a result set, where elements are ranked with respect to relevance and diversity features, i.e., the retrieved elements should be as relevant as…
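To make the relevance/diversity trade-off concrete, here is a minimal sketch of one generic greedy strategy: at each step it picks the candidate with the best weighted sum of relevance and distance to the closest already-selected element. This is not taken from the paper; the function name `greedy_diversify`, the `trade_off` parameter, the Euclidean distance, and the toy data are all assumptions made for illustration.

```python
import math

def greedy_diversify(candidates, relevance, k, trade_off=0.5):
    """Greedily pick up to k items from candidates (hypothetical helper).

    Each step maximizes
        (1 - trade_off) * relevance + trade_off * distance to closest selected item.
    With trade_off = 0 this is a plain relevance ranking.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(item):
            # Diversity is 0 while nothing has been selected yet.
            diversity = min((math.dist(item, s) for s in selected), default=0.0)
            return (1 - trade_off) * relevance[item] + trade_off * diversity
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data (assumed): two near-duplicate relevant points and one distant point.
candidates = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0)]
relevance = {(0.0, 0.0): 1.0, (0.1, 0.0): 0.9, (1.0, 1.0): 0.5}

by_relevance = greedy_diversify(candidates, relevance, k=2, trade_off=0.0)
diversified = greedy_diversify(candidates, relevance, k=2, trade_off=0.5)
```

On this toy input, the pure relevance ranking keeps the two near-duplicates, while the diversified result swaps the second one for the distant point.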

- Caetano Traina, Roberto F. Santos Filho, Agma J. M. Traina, Marcos R. Vieira, Christos Faloutsos
- The VLDB Journal
- 2005

Similarity search operations require executing expensive algorithms, and although broadly useful in many new applications, they rely on specific structures not yet supported by commercial DBMS. In this paper we discuss the new Omni-technique, which allows building a variety of dynamic Metric Access Methods based on a number of selected objects from the…

- Christos Faloutsos, Bernhard Seeger, Agma J. M. Traina, Caetano Traina
- SIGMOD Conference
- 2000

We discovered a surprising law governing the spatial join selectivity across two sets of points. An example of such a spatial join is “find the libraries that are within 10 miles of schools”. Our law dictates that the number of such qualifying pairs follows a power law, whose exponent we call “pair-count exponent” (PC). We…
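The power-law claim above can be illustrated with a small sketch: count the qualifying pairs for several radii and fit a line to the log-log plot, whose slope is an estimate of the exponent. The helper names `pair_count` and `fit_exponent`, the brute-force counting, the synthetic uniform points, and the chosen radii are assumptions for illustration, not the authors' procedure.

```python
import math
import random

def pair_count(a, b, r):
    """Count pairs (p, q), p in a and q in b, with Euclidean distance <= r."""
    return sum(1 for p in a for q in b if math.dist(p, q) <= r)

def fit_exponent(a, b, radii):
    """Least-squares slope of log(pair count) versus log(radius)."""
    xs, ys = [], []
    for r in radii:
        c = pair_count(a, b, r)
        if c > 0:  # log is undefined for empty counts
            xs.append(math.log(r))
            ys.append(math.log(c))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic stand-ins for the "schools" and "libraries" of the example (assumed data).
rng = random.Random(0)
schools = [(rng.random(), rng.random()) for _ in range(400)]
libraries = [(rng.random(), rng.random()) for _ in range(400)]

radii = [0.01 * 2 ** k for k in range(4)]  # 0.01, 0.02, 0.04, 0.08
exponent = fit_exponent(schools, libraries, radii)
```

For uniformly distributed 2-D points the pair count grows roughly with the area of the search disc, so the fitted slope should come out near 2. Real datasets would generally give a different slope.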

Metric Access Methods (MAM) are employed to accelerate the processing of similarity queries, such as the range and the k-nearest neighbor queries. Current methods improve query performance by minimizing the number of disk accesses while keeping a constant height of the structures stored on disk (height-balanced trees). The Slim-tree and the M-tree are the most…
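For reference, the two query types named above can be defined by a plain sequential scan, which is the baseline that metric access methods aim to beat. This is a minimal sketch with no index structure. The function names and the default Euclidean distance are assumptions, and any metric could be passed as `dist`.

```python
import heapq
import math

def range_query(data, center, radius, dist=math.dist):
    """Return every object within `radius` of `center` (sequential scan)."""
    return [x for x in data if dist(center, x) <= radius]

def knn_query(data, center, k, dist=math.dist):
    """Return the k objects closest to `center`, nearest first (sequential scan)."""
    return heapq.nsmallest(k, data, key=lambda x: dist(center, x))

points = [(0, 0), (1, 0), (0, 2), (5, 5)]
in_range = range_query(points, (0, 0), 1.5)
nearest = knn_query(points, (0, 0), 3)
```

Both functions compute one distance per stored object. An index such as the trees discussed above tries to answer the same queries while evaluating far fewer distances and reading fewer disk pages.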

Given a very large moderate-to-high dimensionality dataset, how could one cluster its points? For datasets that don't fit even on a single disk, parallelism is a first-class option. In this paper we explore MapReduce for clustering this kind of data. The main questions are (a) how to minimize the I/O cost, taking into account the already existing data…