• Publications
VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams
TLDR
We develop a novel technique, called VGRAM, to improve the performance of algorithms for approximate queries on string collections, and show significant performance improvements on three existing algorithms.
  • 198
  • 14
Efficiently Indexing Large Sparse Graphs for Similarity Search
TLDR
This paper focuses on the index structure for similarity search on a set of large sparse graphs and proposes an efficient indexing mechanism by introducing the Q-Gram idea.
  • 72
  • 10
Cost-based variable-length-gram selection for string collections to support approximate queries efficiently
TLDR
We analyze how a gram dictionary affects the index structure of the string collection and ultimately the performance of queries.
  • 99
  • 5
Distance-Based Outlier Detection on Uncertain Data
TLDR
We propose a new definition of distance-based outliers on uncertain data and use a dynamic programming method to detect them.
  • 49
  • 5
Most Influential Community Search over Large Social Networks
TLDR
We propose a new community model, maximal kr-Clique community, which has desirable properties, i.e., society, cohesiveness, connectivity, and maximum.
  • 58
  • 4
A Novel Representation and Compression for Queries on Trajectories in Road Networks
TLDR
In this paper, we propose a novel representation, a lossless compression for both spatial path and timestamps, and an error-bounded compression for locations.
  • 9
  • 3
Maximum Co-located Community Search in Large Scale Social Networks
TLDR
We consider the constraint of users’ spatial information in k-truss search, denoted as co-located community search in this paper.
  • 33
  • 2
Protecting Individual Information Against Inference Attacks in Data Publishing
TLDR
We study how to protect sensitive data when an adversary can launch inference attacks using association rules derived from the data.
  • 19
  • 2
Mapping Referential Integrity Constraints from Relational Databases to XML
XML is rapidly emerging as the dominant standard for exchanging data on the WWW. Most application data is stored in relational databases because of their popularity and rich development experience...
  • 7
  • 2
A Generalization Based Approach for Anonymizing Weighted Social Network Graphs
TLDR
The increasing popularity of social networks, such as online communities and telecommunication systems, has generated interesting knowledge discovery and data mining problems.
  • 26
  • 1