The KDD process for extracting useful knowledge from volumes of data

@article{Fayyad1996TheKP,
  title={The KDD process for extracting useful knowledge from volumes of data},
  author={Usama M. Fayyad and Gregory Piatetsky-Shapiro and Padhraic Smyth},
  journal={Commun. ACM},
  year={1996},
  volume={39},
  pages={27--34}
}
As we march into the age of digital information, the problem of data overload looms ominously ahead. Our ability to analyze and understand massive datasets lags far behind our ability to gather and store the data. A new generation of computational techniques and tools is required to support the extraction of useful knowledge from the rapidly growing volumes of data. These techniques and tools are the subject of the emerging field of knowledge discovery in databases (KDD) and data mining. Large…

DATA MINING - A DOMAIN SPECIFIC ANALYTICAL TOOL FOR DECISION MAKING

TLDR
This paper focuses on applications of data mining, the nontrivial extraction of implicit, previously unknown, and potentially useful information from databases, as a way to sift through volumes of data.

Knowledge Discovery Through Experiential Learning From Business and Other Contemporary Data Sources: A Review and Reappraisal

TLDR
The process of knowledge discovery in databases is reviewed, and selected methodologies, methods and tools, tasks, basic learning paradigms, and applications for knowledge generation by computer learning from data instances are described.

Performance Evaluation of Evolutionary and Decision Tree Based Classifiers in Diversity of Datasets

TLDR
This research helps organizations select classifiers as information generators for the decision support systems they use to set future policies, and it demonstrates that evolutionary-approach-based classifiers are slower than decision-tree-based classifiers.

Performance evaluation of decision tree versus artificial neural network based classifiers in diversity of datasets

TLDR
This research shows that decision-tree-based classifiers are better suited to organizational decision support systems than ANN-based classifiers.

Knowledge Discovery in Databases (KDD) with Images: A Novel Approach toward Image Mining and Processing

TLDR
The aim of this paper is to separate the important and unimportant pixels of an image using simple rules; a decision tree classification technique (ID-3) is applied for image compression.

Active Mining Project: Overview

TLDR
KDD emerges as a technique that extracts implicit, previously unknown, and potentially useful information (or patterns) from data, collectively meeting various mining needs.

BM25 Ranking Algorithm Development Using Matching Concepts in Unstructured Text

TLDR
This study offers solutions to the problem and evaluates the results for the BM25 ranking algorithm on text that combines Persian and Latin scripts, a combination that sometimes causes a loss of precision, accuracy, and recall.

From data to knowledge: implications of data mining

Knowledge discovery differs from traditional information retrieval from databases--in which database records (or tuples derived from fields of records) are returned in response to a query--in that what…

Understanding the Classification of Data Mining and Web Mining

TLDR
Web Mining is part of data mining technology, which aims to extract interesting and useful hidden patterns and information from web documents and web activities.

Knowledge discovery from distributed and textual data

TLDR
This thesis examines the problems associated with knowledge discovery, focusing specifically on issues arising from the construction of a categorical classifier using distributed and textual data sources, and suggests the notion of data quality.

References

A statistical perspective on KDD

TLDR
Some major advances in statistics from recent decades that are applicable to Knowledge Discovery in Databases are reviewed.

Applications of machine learning and rule induction

TLDR
This paper aims to provide increasing levels of automation in the knowledge engineering process, replacing much time-consuming human activity with automatic techniques that improve accuracy or efficiency by discovering and exploiting regularities in training data.

Fast Discovery of Association Rules

Bayesian Networks for Knowledge Discovery

  • D. Heckerman
  • Computer Science
    Advances in Knowledge Discovery and Data Mining
  • 1996

From Data Mining to Knowledge Discovery: An Overview

Additional references for this article can be found at http://www.research.microsoft.com/research/datamine/CACM-DM-refs

Proceedings of KDD-95: The First International Conference on Knowledge Discovery and Data Mining

  • 1995