• Publications
  • Influence
A Framework for Clustering Evolving Data Streams
TLDR
This paper discusses a fundamentally different philosophy for data stream clustering which is guided by application-centered requirements.
  • 1,760
  • 234
Transfer Feature Learning with Joint Distribution Adaptation
TLDR
We put forward a novel transfer learning solution, referred to as Joint Distribution Adaptation (JDA), to jointly adapt both the marginal and conditional distributions in a principled dimensionality reduction procedure, and construct a new feature representation that is effective and robust for substantial distribution differences.
  • 756
  • 195
Top 10 algorithms in data mining
TLDR
This paper presents the top 10 data mining algorithms identified by the International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART.
  • 4,170
  • 191
PathSim: Meta Path-Based Top-K Similarity Search in Heterogeneous Information Networks
TLDR
We introduce a meta path-based similarity framework for objects that are defined among the same type of objects in heterogeneous networks.
  • 1,020
  • 178
Mining concept-drifting data streams using ensemble classifiers
TLDR
We train an ensemble of classification models, such as C4.5, RIPPER, naive Bayesian, etc., from sequential chunks of the data stream using weighted ensemble classifiers.
  • 1,351
  • 131
Fast algorithms for projected clustering
TLDR
The clustering problem is well known in the database literature for its numerous applications in problems such as customer segmentation, classification and trend analysis.
  • 1,113
  • 122
A holistic lexicon-based approach to opinion mining
TLDR
We propose a holistic lexicon-based approach to solving the problem of determining the semantic orientations (positive, negative or neutral) of opinions expressed on product features in reviews.
  • 1,335
  • 102
A Comprehensive Survey on Graph Neural Networks
TLDR
We propose a new taxonomy to divide the state-of-the-art GNNs into four categories, namely, recurrent GNNs, convolutional GNNs, graph autoencoders, and spatial–temporal GNNs.
  • 1,316
  • 99
Building text classifiers using positive and unlabeled examples
TLDR
This paper studies the problem of building two-class classifiers with only positive and unlabeled examples, but no negative examples.
  • 609
  • 97
BLINKS: ranked keyword searches on graphs
TLDR
We propose BLINKS, a bi-level indexing and query processing scheme for top-k keyword search on graphs, which provides orders-of-magnitude performance improvement over existing approaches.
  • 594
  • 96