gSpan: graph-based substructure pattern mining
- Xifeng Yan, Jiawei Han
- Computer Science, IEEE International Conference on Data Mining…
- 9 December 2002
A novel algorithm called gSpan (graph-based substructure pattern mining) is proposed, which discovers frequent substructures without candidate generation by building a new lexicographic order among graphs and mapping each graph to a unique minimum DFS code as its canonical label.
PathSim: Meta Path-Based Top-K Similarity Search in Heterogeneous Information Networks
- Yizhou Sun, Jiawei Han, Xifeng Yan, Philip S. Yu, Tianyi Wu
- Computer Science, Proceedings of the VLDB Endowment
- 1 August 2011
Under the meta path framework, a novel similarity measure called PathSim is defined that is able to find peer objects in the network (e.g., authors in a similar field and with a similar reputation), which proves more meaningful in many scenarios than random-walk based similarity measures.
CloSpan: Mining Closed Sequential Patterns in Large Datasets
- Xifeng Yan, Jiawei Han, Ramin Afshar
- Computer Science, SDM
- 2003
Frequent pattern mining: current status and future directions
- Jiawei Han, Hong Cheng, Dong Xin, Xifeng Yan
- Computer Science, Data Mining and Knowledge Discovery
- 1 August 2007
It is believed that frequent pattern mining research has substantially broadened the scope of data analysis and will have a deep impact on data mining methodologies and applications in the long run. However, there are still some challenging research issues that need to be solved before frequent pattern mining can claim to be a cornerstone approach in data mining applications.
Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting
- Shiyang Li, Xiaoyong Jin, Xifeng Yan
- Computer Science, Neural Information Processing Systems
- 29 June 2019
Convolutional self-attention is proposed, producing queries and keys with causal convolution so that local context can be better incorporated into the attention mechanism; the LogSparse Transformer is also proposed, improving forecasting accuracy for time series with fine granularity and strong long-term dependencies under a constrained memory budget.
CloseGraph: mining closed frequent graph patterns
- Xifeng Yan, Jiawei Han
- Computer Science, Knowledge Discovery and Data Mining
- 24 August 2003
A closed graph pattern mining algorithm, CloseGraph, is developed by exploring several interesting pruning methods; it not only dramatically reduces the number of unnecessary subgraphs generated but also substantially increases mining efficiency, especially in the presence of large graph patterns.
Graph indexing: a frequent structure-based approach
- Xifeng Yan, Philip S. Yu, Jiawei Han
- Computer Science, ACM SIGMOD Conference
- 13 June 2004
The gIndex approach not only provides an elegant solution to the graph indexing problem, but also demonstrates how database indexing and query processing can benefit from data mining, especially frequent pattern mining.
Mining Frequent Patterns in Data Streams at Multiple Time Granularities
- C. Giannella, Jiawei Han, Xifeng Yan, Philip S. Yu
- Computer Science
- 2002
This paper proposes computing and maintaining all the frequent patterns, dynamically updating them with the incoming data streams, and incrementally maintaining tilted-time windows for each pattern at multiple time granularities.
SOBER: statistical model-based bug localization
- Chao Liu, Xifeng Yan, Long Fei, Jiawei Han, S. Midkiff
- Computer Science, ESEC/FSE-13
- 5 September 2005
The results demonstrate the power of the approach in bug localization: SOBER can help programmers locate 68 out of 130 bugs in the Siemens suite when programmers are expected to examine no more than 10% of the code, whereas the best previously reported result is 52 out of 130.
Statistical Debugging: A Hypothesis Testing-Based Approach
- Chao Liu, Long Fei, Xifeng Yan, Jiawei Han, S. Midkiff
- Computer Science, IEEE Transactions on Software Engineering
- 1 October 2006
A new statistical method, called SOBER, is proposed, which automatically localizes software faults without any prior knowledge of the program semantics and models the predicate evaluation in both correct and incorrect executions.
...