Frequent pattern mining has been a focused theme in data mining research for over a decade. Abundant literature has been dedicated to this research and tremendous progress has been made, ranging from efficient and scalable algorithms for frequent itemset mining in transaction databases to numerous research frontiers, such as sequential pattern mining, …
The stability of the Wnt pathway transcription factor beta-catenin is tightly regulated by the multi-subunit destruction complex. Deregulated Wnt pathway activity has been implicated in many cancers, making this pathway an attractive target for anticancer therapies. However, the development of targeted Wnt pathway inhibitors has been hampered by the limited …
The goal of graph clustering is to partition vertices in a large graph into different clusters based on various criteria such as vertex connectivity or neighborhood similarity. Graph clustering techniques are very useful for detecting densely connected groups in a large graph. Many existing graph clustering methods mainly focus on the topological structure …
The application of frequent patterns in classification appeared in sporadic studies and achieved initial success in the classification of relational data, text documents and graphs. In this paper, we conduct a systematic exploration of frequent pattern-based classification, and provide solid reasons supporting this methodology. It was well known that …
With ever-increasing amounts of graph data from disparate sources, there has been a strong need for exploiting significant graph patterns with user-specified objective functions. Most objective functions are not antimonotonic, which could cause all frequency-centric graph mining algorithms to fail. In this paper, we give the first comprehensive study on general …
Graph-based semi-supervised learning has gained considerable interest in the past several years thanks to its effectiveness in combining labeled and unlabeled data through label propagation for better object modeling and classification. A critical issue in constructing a graph is the weight assignment, where the weight of an edge specifies the similarity …
Chromosomal rearrangements fusing the androgen-regulated gene TMPRSS2 to the oncogenic ETS transcription factor ERG occur in approximately 50% of prostate cancers, but how the fusion products regulate prostate cancer remains unclear. Using chromatin immunoprecipitation coupled with massively parallel sequencing, we found that ERG disrupts androgen receptor …
The application of frequent patterns in classification has demonstrated its power in recent studies. It often adopts a two-step approach: frequent pattern (or classification rule) mining followed by feature selection (or rule ranking). However, this two-step process could be computationally expensive, especially when the problem scale is large or the …
Graph clustering, also known as community detection, is a long-standing problem in data mining. However, with the proliferation of rich attribute information available for objects in real-world graphs, how to leverage structural and attribute information for clustering attributed graphs becomes a new challenge. Most existing works take a distance-based …
As information networks become ubiquitous, extracting knowledge from information networks has become an important task. Both ranking and clustering can provide overall views on information network data, and each has been a hot topic by itself. However, ranking objects globally without considering which clusters they belong to often leads to dumb results, …