gSpan: graph-based substructure pattern mining
- Xifeng Yan, Jiawei Han
- Mathematics, Computer Science
- IEEE International Conference on Data Mining…
- 9 December 2002
gSpan (graph-based substructure pattern mining) is a novel algorithm that discovers frequent substructures without candidate generation. It builds a new lexicographic order among graphs and maps each graph to a unique minimum DFS code as its canonical label.
PathSim: Meta Path-Based Top-K Similarity Search in Heterogeneous Information Networks
- Yizhou Sun, Jiawei Han, Xifeng Yan, Philip S. Yu, Tianyi Wu
- Computer Science
- Proc. VLDB Endow.
- 1 August 2011
Under the meta path framework, a novel similarity measure called PathSim is defined that is able to find peer objects in the network (e.g., authors in a similar field and with similar reputation), which turns out to be more meaningful in many scenarios than random-walk-based similarity measures.
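As a rough illustration of why such a measure favors peers, here is a minimal Python sketch. It assumes a symmetric author-venue-author meta path and scores a pair as twice the number of path instances between the two authors, divided by the sum of each author's path instances back to itself. The toy publication counts and the names `AUTHOR_VENUE_COUNTS`, `path_count`, `pathsim`, and `top_k` are hypothetical and are not taken from the paper.

```python
# Hypothetical toy data: author -> {venue: paper count}. Illustrative only.
AUTHOR_VENUE_COUNTS = {
    "alice": {"KDD": 2, "VLDB": 1},
    "bob": {"KDD": 4, "VLDB": 2},
    "carol": {"KDD": 20, "VLDB": 10},
    "dave": {"ICSE": 3},
}


def path_count(x, y, counts=AUTHOR_VENUE_COUNTS):
    """Count author-venue-author path instances between authors x and y."""
    row_x, row_y = counts[x], counts[y]
    return sum(n * row_y.get(venue, 0) for venue, n in row_x.items())


def pathsim(x, y, counts=AUTHOR_VENUE_COUNTS):
    """Assumed form: 2 * paths(x, y) / (paths(x, x) + paths(y, y))."""
    denom = path_count(x, x, counts) + path_count(y, y, counts)
    return 2 * path_count(x, y, counts) / denom if denom else 0.0


def top_k(query, k, counts=AUTHOR_VENUE_COUNTS):
    """Return the k authors most similar to the query, best first."""
    others = [a for a in counts if a != query]
    ranked = sorted(others, key=lambda a: pathsim(query, a, counts), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    for author in top_k("alice", 3):
        print(author, round(pathsim("alice", author), 3))
```

In this toy data, carol shares more raw path instances with alice than bob does (50 versus 10), yet bob ranks first because his publication volume is closer to alice's. The normalization by self-path counts is what pushes the ranking toward peers rather than toward the most prolific authors.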
CloSpan: Mining Closed Sequential Patterns in Large Datasets
Frequent pattern mining: current status and future directions
- Jiawei Han, Hong Cheng, Dong Xin, Xifeng Yan
- Computer Science
- Data Mining and Knowledge Discovery
- 1 August 2007
It is believed that frequent pattern mining research has substantially broadened the scope of data analysis and will have a deep impact on data mining methodologies and applications in the long run. However, some challenging research issues still need to be solved before frequent pattern mining can claim to be a cornerstone approach in data mining applications.
Graph indexing: a frequent structure-based approach
The gIndex approach not only provides an elegant solution to the graph indexing problem, but also demonstrates how database indexing and query processing can benefit from data mining, especially frequent pattern mining.
CloseGraph: mining closed frequent graph patterns
A closed graph pattern mining algorithm, CloseGraph, is developed by exploring several interesting pruning methods. It not only dramatically reduces the number of unnecessary subgraphs generated but also substantially increases the efficiency of mining, especially in the presence of large graph patterns.
Mining Frequent Patterns in Data Streams at Multiple Time Granularities
This paper proposes computing and maintaining all the frequent patterns, dynamically updating them with the incoming data streams, and incrementally maintaining tilted-time windows for each pattern at multiple time granularities.
SOBER: statistical model-based bug localization
The results demonstrate the power of the approach in bug localization: SOBER can help programmers locate 68 out of 130 bugs in the Siemens suite when programmers are expected to examine no more than 10% of the code, whereas the best previously reported result is 52 out of 130.
Statistical Debugging: A Hypothesis Testing-Based Approach
- Chao Liu, Long Fei, Xifeng Yan, Jiawei Han, S. Midkiff
- Computer Science
- IEEE Transactions on Software Engineering
- 1 October 2006
A new statistical method, called SOBER, is proposed. It models predicate evaluation in both correct and incorrect executions and automatically localizes software faults without any prior knowledge of the program semantics.
Discriminative Frequent Pattern Analysis for Effective Classification
- Hong Cheng, Xifeng Yan, Jiawei Han, Chih-Wei Hsu
- Computer Science
- IEEE 23rd International Conference on Data…
- 15 April 2007
This paper develops a strategy to set minimum support in frequent pattern mining for generating useful patterns, and demonstrates that the frequent pattern-based classification framework can achieve good scalability and high accuracy in classifying large datasets.