Yiqun Zhang

Decision tree induction is a typical machine learning approach that has been extensively applied to data mining and knowledge discovery. For numerical and mixed data, discretization is an essential pre-processing step of decision tree learning. However, when coping with big data, most existing discretization approaches will not be …
Traditional hierarchical clustering (HC) methods do not scale with database size. To address this issue, a series of summarization techniques, i.e., data bubbles (DB) and their improved versions, have been proposed to compress very large databases into representative seed points suitable for subsequent hierarchy construction. However, DB and its …