Learn More
We propose an indexing method for time sequences for processing similarity queries. We use the Discrete Fourier Transform (DFT) to map time sequences to the frequency domain, the crucial observation being that, for most sequences of practical interest, only the rst few frequencies are strong. Another important observation is Parseval's theorem, which(More)
Given a water distribution network, where should we place sensors toquickly detect contaminants? Or, which blogs should we read to avoid missing important stories?. These seemingly different problems share common structure: Outbreak detection can be modeled as selecting nodes (sensor locations, blogs) in a network, in order to detect the spreading of a(More)
We present an efficient indexing method to locate 1-dimensional subsequences within a collection of sequences, such that the subsequences match a given (query) pattern within a specified tolerance. The idea is to map each data sequences into a small set of multidimensional rectangles in feature space. Then, these rectangles can be readily indexed using(More)
How do real graphs evolve over time? What are "normal" growth patterns in social, technological, and information networks? Many studies have discovered patterns in <i>static graphs</i>, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include heavy tails for in- and out-degree distributions,(More)
How can we generate realistic networks? In addition, how can we do so with a mathematically tractable model that allows for rigorous analysis of network properties? Real networks exhibit a long list of surprising properties: Heavy tails for the in-and out-degree distribution, heavy tails for the eigenvalues and eigenvectors, small diameters, and(More)
How does a 'normal' computer (or social) network look like? How can we spot 'abnormal' sub-networks in the Internet, or web graph? The answer to such questions is vital for outlier detection (terrorist networks, or illegal money-laundering rings), forecasting, and simulations (" how will a computer virus spread? "). The heart of the problem is finding the(More)
A very promising idea for fast searching in traditional and multimedia databases is to map objects into points in <i>k</i>-d space, using <i>k</i> feature-extraction functions, provided by a domain expert [25]. Thus, we can subsequently use highly fine-tuned spatial access methods (SAMs), to answer several types of queries, including the 'Query By Example'(More)
Given a huge real graph, how can we derive a representative sample? There are many known algorithms to compute interesting measures (shortest paths, centrality, betweenness, etc.), but several of them become impractical for large graphs. Thus graph sampling is essential.The natural questions to ask are (a) which sampling method to use, (b) how small can the(More)
How do real graphs evolve over time&quest; What are normal growth patterns in social, technological, and information networks&quest; Many studies have discovered patterns in <i>static graphs</i>, identifying properties in a single snapshot of a large network or in a very small number of snapshots; these include heavy tails for in- and out-degree(More)
In the QBIC (Query By Image Content) project we are studying methods to query large on-line image databases using the images' content as the basis of the queries. Examples of the content we use include color, texture, and shape of image objects and regions. Potential applications include medical ("Give me other images that contain a tumor with a texture(More)