A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise
TLDR
DBSCAN, a new clustering algorithm relying on a density-based notion of clusters and designed to discover clusters of arbitrary shape, is presented; it requires only one input parameter and supports the user in determining an appropriate value for it.
PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes
TLDR
This work developed PSORTb version 3.0 with improved recall, higher proteome-scale prediction coverage, and new refined localization subcategories, and evaluated the most accurate SCL predictors using 5-fold cross validation plus an independent proteomics analysis.
A matrix factorization technique with trust propagation for recommendation in social networks
TLDR
A model-based approach for recommendation in social networks is presented that employs matrix factorization techniques and incorporates the mechanism of trust propagation into the model; modeling trust propagation is shown to lead to a substantial increase in recommendation accuracy, in particular for cold start users.
Density-Based Clustering over an Evolving Data Stream with Noise
TLDR
A novel pruning strategy based on these concepts is designed, which guarantees the precision of the weights of the micro-clusters with limited memory, and the effectiveness and efficiency of the method are demonstrated.
Density-Based Clustering in Spatial Databases: The Algorithm GDBSCAN and Its Applications
TLDR
The generalized algorithm GDBSCAN can cluster point objects as well as spatially extended objects according to both their spatial and their nonspatial attributes, and four applications using 2D points (astronomy), 3D points (biology), 5D points, and 2D polygons are presented, demonstrating the applicability of GDBSCAN to real-world problems.
PSORTb v.2.0: Expanded prediction of bacterial protein subcellular localization and insights gained from comparative proteome analysis
TLDR
It is shown that the proportion of proteins at each localization is remarkably consistent across species, even in species with varying proteome size.
Collaborative Denoising Auto-Encoders for Top-N Recommender Systems
TLDR
It is demonstrated that the proposed model is a generalization of several well-known collaborative filtering models but with more flexible components, and that CDAE consistently outperforms state-of-the-art top-N recommendation methods on a variety of common evaluation metrics.
TrustWalker: a random walk model for combining trust-based and item-based recommendation
TLDR
A random walk model combining the trust-based and the collaborative filtering approaches to recommendation is proposed, which makes it possible to define and measure the confidence of a recommendation.
Hierarchical Document Clustering using Frequent Itemsets
TLDR
This paper proposes to use the notion of frequent itemsets, which comes from association rule mining, for document clustering, and shows that this method outperforms the best existing methods in terms of both clustering accuracy and scalability.
Frequent term-based text clustering
TLDR
Two algorithms for frequent term-based text clustering are presented: FTC, which creates flat clusterings, and HFTC, for hierarchical clustering. Both obtain clusterings of comparable quality significantly more efficiently than state-of-the-art text clustering algorithms.