2.2.2 Concept Description versus On-line Analytical Processing

  • Published 2007


Descriptive data mining describes the general or special features of a set of data in a concise manner. Mining characteristic or comparative descriptions of the data is an essential component of descriptive data mining. Data description differs from on-line analytical processing in two ways: (1) it strives for more automated processing, helping users determine which dimensions (or attributes) should be included in the analysis and to how high a level the data set should be generalized in order to generate interesting summarizations; and (2) it strives to handle more complicated data types. Mining characteristic and comparative descriptions of data can be implemented with a data cube method or an attribute-oriented induction method. Data description can also be enhanced by data dispersion analysis and multi-feature cubes.

From the data analysis point of view, data mining can be classified into descriptive data mining and predictive data mining. The former describes a set of data in a concise, summary manner and presents general properties of the data, whereas the latter constructs one or a set of models from the data and attempts to predict the behavior of new data sets. The simplest kind of descriptive data mining is concept description (or class description, when the concept to be described refers to a class of objects). A concept usually refers to a collection of data, such as winners, frequent buyers, best sellers, and so on. As a data mining task, concept description is not a simple enumeration of the data. Instead, it generates characteristic and/or comparative descriptions of the data: concept characterization provides a concise and succinct summary of a concept, whereas concept comparison (also known as discrimination) provides a comparative summary of the concept being examined (often called the target class) in contrast to one or a set of comparative concepts (often called the contrasting classes).
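To make concept characterization concrete, the following is a minimal sketch in the spirit of attribute-oriented induction: each tuple is generalized by climbing a concept hierarchy, and identical generalized tuples are merged with a count. The data, the hierarchy (`CITY_TO_COUNTRY`, `age_to_range`) and the function name `characterize` are hypothetical and chosen only for illustration; they are not taken from the text.

```python
from collections import Counter

# Hypothetical concept hierarchy: city -> country (illustrative values only).
CITY_TO_COUNTRY = {
    "Vancouver": "Canada",
    "Toronto": "Canada",
    "Seattle": "USA",
    "Chicago": "USA",
}


def age_to_range(age):
    """Hypothetical generalization of a numeric age to a coarse range."""
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50+"


def characterize(rows):
    """Generalize each (city, age) tuple and merge identical results.

    Returns (country, age_range, count, fraction) tuples, most frequent first.
    """
    counts = Counter(
        (CITY_TO_COUNTRY[city], age_to_range(age)) for city, age in rows
    )
    total = sum(counts.values())
    return [
        (country, age_range, n, n / total)
        for (country, age_range), n in counts.most_common()
    ]


# Hypothetical target class: "frequent buyers".
frequent_buyers = [
    ("Vancouver", 25), ("Toronto", 27), ("Seattle", 41),
    ("Chicago", 45), ("Toronto", 52), ("Vancouver", 29),
]

summary = characterize(frequent_buyers)
for country, age_range, n, share in summary:
    print(f"{country:8} {age_range:9} count={n} share={share:.2f}")
```

How far to generalize each attribute is fixed by hand in this sketch; the text's point is that data description aims to help users make that choice more automatically.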
Besides describing data by their general properties, one may also describe data by their clustering or dispersion properties. Concept description, which characterizes a collection of data and compares it with others in a concise and succinct manner, is an essential task in data mining. It can be presented in many forms, including generalized relations, cross-tabulations (or, briefly, crosstabs), charts, graphs, etc. It can also be presented in the form of a logical rule. A rule, as a conjunction of properties shared by all the entities in the class, is …
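As an illustration of the crosstab form of presentation, the sketch below counts generalized tuples for a target class and a contrasting class and prints them side by side. The record fields (`class`, `region`), the class labels and the helper names `crosstab` and `format_crosstab` are assumptions made for this example only.

```python
from collections import defaultdict


def crosstab(records, row_key, col_key):
    """Count records for every (row value, column value) pair."""
    table = defaultdict(lambda: defaultdict(int))
    for rec in records:
        table[rec[row_key]][rec[col_key]] += 1
    return {row: dict(cols) for row, cols in table.items()}


def format_crosstab(table):
    """Render the counts as tab-separated text with a row total column."""
    cols = sorted({c for row in table.values() for c in row})
    lines = ["\t".join(["", *cols, "total"])]
    for row in sorted(table):
        cells = [table[row].get(c, 0) for c in cols]
        lines.append("\t".join([row, *map(str, cells), str(sum(cells))]))
    return "\n".join(lines)


# Hypothetical generalized tuples: a target class and a contrasting class.
records = [
    {"class": "frequent buyer", "region": "Canada"},
    {"class": "frequent buyer", "region": "Canada"},
    {"class": "frequent buyer", "region": "USA"},
    {"class": "occasional buyer", "region": "Canada"},
    {"class": "occasional buyer", "region": "USA"},
    {"class": "occasional buyer", "region": "USA"},
]

table = crosstab(records, "class", "region")
text = format_crosstab(table)
print(text)
```

Reading across a row characterizes one class; reading down a column compares the target class with the contrasting class.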
