From Data Mining to Knowledge Discovery in Databases
An overview of this emerging field is provided, clarifying how data mining and knowledge discovery in databases are related both to each other and to related fields, such as machine learning, statistics, and databases.
Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning
This paper addresses the use of the entropy minimization heuristic for discretizing the range of a continuous-valued attribute into multiple intervals.
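As a rough illustration of the entropy minimization heuristic mentioned above, the following sketch picks a single binary cut point for one continuous attribute by minimizing the weighted class entropy of the two resulting intervals. It is a minimal sketch under stated assumptions, not the paper's procedure: the function names `entropy` and `best_cut_point` are hypothetical, and the recursive application to multiple intervals, along with any stopping rule, is left out.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_cut_point(values, labels):
    """Return (cut, weighted_entropy) for the binary split of `values`
    that minimizes the weighted class entropy of the two intervals.

    Hypothetical helper for illustration. Returns (None, entropy(labels))
    when all values are identical and no split is possible.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    best_cut, best_score = None, entropy(labels)
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # a cut can only fall between two distinct values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        if best_cut is None or score < best_score:
            # Place the cut midway between the two neighbouring values.
            best_cut, best_score = (lo + hi) / 2, score
    return best_cut, best_score


if __name__ == "__main__":
    values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
    labels = ["a", "a", "a", "b", "b", "b"]
    print(best_cut_point(values, labels))
```

A multi-interval discretization could then be obtained by calling `best_cut_point` again on each side of the chosen cut until some stopping condition holds; which condition to use is a design decision this sketch does not make.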
From Data Mining to Knowledge Discovery: An Overview
The KDD process for extracting useful knowledge from volumes of data
A new generation of computational techniques and tools is required to support the extraction of useful knowledge from the rapidly growing volumes of data, the subject of the emerging field of knowledge discovery in databases (KDD) and data mining.
Advances in Knowledge Discovery and Data Mining
- U. Fayyad, G. Piatetsky-Shapiro, Padhraic Smyth, R. Uthurusamy
- Computer Science
- Lecture Notes in Computer Science
This talk discusses how to mine evolving data streams while preserving privacy. Because the characteristics of a data stream can change over time, the evolving patterns need to be captured.
Refining Initial Points for K-Means Clustering
A procedure for computing a refined starting condition from a given initial one, based on an efficient technique for estimating the modes of a distribution, that allows the iterative algorithm to converge to a “better” local minimum.
Knowledge Discovery and Data Mining: Towards a Unifying Framework
The KDD process and basic data mining algorithms are defined, links between data mining, knowledge discovery, and other related fields are described, and challenges facing practitioners in the field are analyzed.
Scaling Clustering Algorithms to Large Databases
A scalable clustering framework, applicable to a wide class of iterative clustering algorithms, that requires at most one scan of the database; it is instantiated and numerically justified with the popular K-Means clustering algorithm.
Hierarchical Clustering Algorithms for Document Datasets
The experimental evaluation shows that, contrary to the common belief, partitional algorithms always lead to better solutions than agglomerative algorithms, making them ideal for clustering large document collections because of both their relatively low computational requirements and their higher clustering quality.
Knowledge Discovery in Databases: An Overview
- U. Fayyad
- Computer Science
- ILP
- 17 September 1997
This talk defines the basic notions in data mining and KDD, defines the goals, presents motivation, gives a high-level definition of the KDD process and how it relates to data mining, and focuses on data mining methods.