A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise
TLDR
DBSCAN, a new clustering algorithm that relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape, is presented; it requires only one input parameter and supports the user in determining an appropriate value for it.
LOF: identifying density-based local outliers
TLDR
This paper contends that for many scenarios, it is more meaningful to assign to each object a degree of being an outlier, called the local outlier factor (LOF), and gives a detailed formal analysis showing that LOF enjoys many desirable properties.
The R*-tree: an efficient and robust access method for points and rectangles
TLDR
The R*-tree is designed to incorporate a combined optimization of the area, margin and overlap of each enclosing rectangle in the directory, and it clearly outperforms the existing R-tree variants.
OPTICS: ordering points to identify the clustering structure
TLDR
A new algorithm is introduced for the purpose of cluster analysis which does not produce a clustering of a data set explicitly, but instead creates an augmented ordering of the database representing its density-based clustering structure.
A Three-Way Model for Collective Learning on Multi-Relational Data
TLDR
This work presents a novel approach to relational learning based on the factorization of a three-way tensor that is able to perform collective learning via the latent components of the model, and provides an efficient algorithm to compute the factorization.
Integrating structured biological data by Kernel Maximum Mean Discrepancy
TLDR
A novel statistical test of whether two samples are from the same distribution is proposed; it is compatible with both multivariate and structured data, fast, easy to implement, and works well, as confirmed by the experiments.
The X-tree: An Index Structure for High-Dimensional Data
TLDR
A new organization of the directory is introduced which uses a split algorithm minimizing overlap and additionally utilizes the concept of supernodes to keep the directory as hierarchical as possible, and at the same time to avoid splits in the directory that would result in high overlap.
Density-Based Clustering in Spatial Databases: The Algorithm GDBSCAN and Its Applications
TLDR
The generalized algorithm, GDBSCAN, can cluster point objects as well as spatially extended objects according to both their spatial and their nonspatial attributes, and four applications using 2D points (astronomy), 3D points (biology), 5D points and 2D polygons are presented, demonstrating the applicability of GDBSCAN to real-world problems.
Shortest-path kernels on graphs
  • K. Borgwardt, H. Kriegel
  • Mathematics, Computer Science
  • Fifth IEEE International Conference on Data…
  • 27 November 2005
TLDR
This work proposes graph kernels based on shortest paths, which are computable in polynomial time, retain expressivity and are still positive definite, and which show significantly higher classification accuracy than walk-based kernels.