Sequential pattern mining plays an important role in many applications, such as bioinformatics and consumer behavior analysis. However, the classic frequency-based framework often leads to many patterns being identified, most of which are not informative enough for business decision-making. In frequent pattern mining, a recent effort has been to incorporate …
Traditional data mining research mainly focuses on developing, demonstrating, and pushing the use of specific algorithms and models. The process of data mining stops at pattern identification. Consequently, a widely seen fact is that 1) many algorithms have been designed, of which very few are repeatable and executable in the real world, 2) often many …
Recognition of protein folding patterns is an important step in protein structure and function prediction. The traditional sequence similarity-based approach fails to yield convincing predictions when proteins have low sequence identities, while the taxonometric approach is a reliable alternative. From a pattern recognition perspective, protein fold …
Collaborative filtering (CF) is a major technique in recommender systems that helps users find items they may want. Because data sparsity is commonly encountered in real-world scenarios, Cross-Domain Collaborative Filtering (CDCF) has become an emerging research topic in recent years. However, due to the lack of sufficient …
Outlier detection is an important problem that has been studied within diverse research areas and application domains. Most existing methods are based on the assumption that an example can be exactly categorized as either a normal class or an outlier. However, in many real-life applications, data are uncertain in nature due to various errors or partial …
Data mining increasingly faces complex challenges in the real-life world of business problems and needs. The gap between business expectations and R&D results in this area involves key aspects of the field, such as methodologies, targeted problems, pattern interestingness, and infrastructure support. Both researchers and practitioners are realizing the …
  • Longbing Cao
  • 2008 IEEE International Conference on Data Mining…
  • 2008
In deploying data mining in real-world business, we have to cater for business scenarios, organizational factors, user preferences, and business needs. However, current data mining algorithms and tools often stop at the delivery of patterns satisfying expected technical interestingness. Business people are not informed about how and what to do to …
High utility sequential pattern mining is an emerging topic in the data mining community. Compared to classic frequent sequence mining, the utility framework provides more informative and actionable knowledge, since the utility of a sequence indicates business value and impact. However, the introduction of "utility" makes the problem fundamentally …