DATA MINING TECHNIQUES

@inproceedings{Zaki2003DATAMT,
  title={DATA MINING TECHNIQUES},
  author={Mohammed J. Zaki and Limsoon Wong},
  year={2003}
}
Data mining is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. [...] The techniques covered include association rules, sequence mining, decision tree classification, and clustering. Some aspects of preprocessing and postprocessing are also covered. The problem of predicting contact maps for protein sequences is used as a detailed case study. The material presented here is compiled by LW based on the original…
Principles of Data Mining
  • M. Bramer
  • Computer Science
  • Undergraduate Topics in Computer Science
  • 2007
A Comparative Study Of Data Clustering Techniques
This study compares data mining techniques on the basis of size, model, application areas, and other features, and indicates when and which data mining techniques are used.
A Review of Data Mining Techniques
Information technology has revolutionized the whole world with cheaper and faster communication through different modes. All these devices generate lots of data which need to be processed to extract…
Data Mining: Next Generation Challenges and Future Directions
The significance of applying data mining in different areas, its challenges, and its future directions are discussed, and it is pointed out that data mining technology is becoming more and more powerful.
Data Mining Techniques on Medical Data for Finding Locally Frequent Diseases
In the last decade there has been increasing usage of data mining techniques on medical data for discovering useful patterns or trends which are used in decision making and diagnosis. Data mining…
Anomaly Detection in Data Mining: A Review
The main goal is to detect anomalies in time series data using machine learning techniques.
Anomaly Detection in Data Mining using Fuzzy C-Means Technique and Artificial Neural Network
The main goal is to detect anomalies in time series data using machine learning techniques, making the data less open to attack.
A Survey on Data Mining in Big Data
A collection of large and complex data is termed big data. Tons of data are collected in applications such as medical processing, weather reporting, digital libraries, etc., and these data should be…
Data mining tools
Criteria for the tool categorization based on different user groups, data structures, data mining tasks and methods, visualization and interaction styles, import and export options for data and models, platforms, and license policies are proposed.
A Study on Milestones of Association Rule Mining Algorithms in Large Databases
The important concepts of association rule mining, the existing algorithms, and their effectiveness and drawbacks are presented, and the main theoretical issues are covered to guide researchers toward interesting research directions that have yet to be explored.

References

SHOWING 1-10 OF 95 REFERENCES
Data Mining for Scientific and Engineering Applications
It is shown that the diversity of applications, the richness of the problems faced by practitioners, and the opportunity to borrow ideas from other domains make scientific data mining an exciting and challenging field.
Data Mining: Concepts and Techniques
This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Scalable Algorithms for Association Mining
Efficient algorithms are presented for the discovery of frequent itemsets, which forms the compute-intensive phase of the association mining task, and the effect of using different database layout schemes combined with the proposed decomposition and traversal techniques is presented.
Mining association rules: anti-skew algorithms
  • Jun-Lin Lin, M. Dunham
  • Computer Science
  • Proceedings 14th International Conference on Data Engineering
  • 1998
This work proposes several techniques which overcome the problem of data skew in the basket data, and employs prior knowledge collected during the mining process and/or via sampling to further reduce the number of candidate itemsets and identify false candidate itemsets at an earlier stage.
Efficient algorithms for mining outliers from large data sets
A novel formulation for distance-based outliers that is based on the distance of a point from its kth nearest neighbor is proposed and the top n points in this ranking are declared to be outliers.
Efficient mining of emerging patterns: discovering trends and differences
  • Guozhu Dong, Jinyan Li
  • Computer Science
  • KDD '99
  • 1999
It is believed that EPs with low to medium support, such as 1%-20%, can give useful new insights and guidance to experts, in even “well understood” applications.
An Efficient Algorithm for Mining Association Rules in Large Databases
This paper presents an efficient algorithm for mining association rules that is fundamentally different from known algorithms and not only reduces the I/O overhead significantly but also has lower CPU overhead for most cases.
BIRCH: an efficient data clustering method for very large databases
A data clustering method named BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) is presented, and it is demonstrated that it is especially suitable for very large databases.
PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth
  • J. Pei, Jiawei Han, +4 authors M. Hsu
  • Computer Science
  • Proceedings 17th International Conference on Data Engineering
  • 2001
This work proposes a novel sequential pattern mining method, called PrefixSpan (i.e., Prefix-projected Sequential pattern mining), which explores prefix-projection in sequential pattern mining, and shows that PrefixSpan outperforms both the Apriori-based GSP algorithm and another recently proposed method, FreeSpan, in mining large sequence databases.
Algorithms for Mining Distance-Based Outliers in Large Datasets
This paper provides formal and empirical evidence showing the usefulness of DB-outliers and presents two simple algorithms for computing such outliers, both having a complexity of O(k N^2), k being the dimensionality and N being the number of objects in the dataset.