Learn More
—Outlier mining is a major task in data analysis. Outliers are objects that highly deviate from regular objects in their local neighborhood. Density-based outlier ranking methods score each object based on its degree of deviation. In many applications, these ranking methods degenerate to random list-ings due to low contrast between outliers and regular(More)
Clustering high dimensional data is an emerging research field. Subspace clustering or projected clustering group similar objects in subspaces, i.e. projections, of the full space. In the past decade, several clustering paradigms have been developed in parallel, without thorough evaluation and comparison between these paradigms on a common basis. Conclusive(More)
—Subspace clustering aims at detecting clusters in any subspace projection of a high dimensional space. As the number of possible subspace projections is exponential in the number of dimensions, the result is often tremendously large. Recent approaches fail to reduce results to relevant subspace clusters. Their results are typically highly redundant, i.e.(More)
Subspace clustering and frequent itemset mining via " step-by-step " algorithms that search the subspace/pattern lattice in a top-down or bottom-up fashion do not scale to large high dimensional data bases. Recent " jump " algorithms directly choose candidate subspace regions or patterns. Their scalability and quality depend heavily on the rating of these(More)
To gain insight into today's large data resources, data mining provides automatic aggregation techniques. Clustering aims at grouping data such that objects within groups are similar while objects in different groups are dissimilar. In scenarios with many attributes or with noise, clusters are often hidden in subspaces of the data and do not show up in the(More)
Knowledge discovery in databases requires not only development of novel mining techniques but also fair and comparable quality assessment based on objective evaluation measures. Especially in young research areas where no common measures are available, researchers are unable to provide a fair evaluation. Typically, publications glorify the high quality of(More)
In the knowledge discovery process, clustering is an established technique for grouping objects based on mutual similarity. However, in today's applications for each object very many attributes are provided. As multiple concepts described by different attributes are mixed in the same data set, clusters do not appear in all dimensions. In these high(More)
— Outlier analysis is an important data mining task that aims to detect unexpected, rare, and suspicious objects. Outlier ranking enables enhanced outlier exploration, which assists the user-driven outlier analysis. It overcomes the binary detection of outliers vs. regular objects, which is not adequate for many applications. Traditional outlier ranking(More)
Finding groups of similar objects in databases is one of the most important data mining tasks. Recently, traditional clustering approaches have been extended to generate alternative clustering solutions. The basic observation is that for each database object multiple meaningful groupings might exist: the data allows to be clustered through different(More)