Lixin Fu

We report on a new, efficient encoding for the data cube, which results in a drastic speed-up of OLAP queries that aggregate along any combination of dimensions over numerical and categorical attributes. We focus on a class of queries called cube queries, which return aggregated values rather than sets of tuples. Our approach, termed CubiST++ ...
Being able to efficiently answer arbitrary OLAP queries that aggregate along any combination of dimensions over numerical and categorical attributes has been a continued, major concern in data warehousing. In this paper, we introduce a new data structure, called Statistics Tree (ST), together with an efficient algorithm called CubiST, for evaluating ad-hoc ...
Existing decision tree algorithms need to recursively partition the dataset into subsets according to some splitting criteria. For large datasets, this requires multiple passes over the original dataset and is therefore infeasible in many applications. In this article we use statistics trees to compute the data cube and then build a decision tree on top of it. ...
We present a novel approach to speeding up the evaluation of OLAP queries that return aggregates over dimensions containing hierarchies. Our approach is based on our previous version of CubiST (Cubing with Statistics Trees), which pre-computes and stores all possible aggregate views in the leaves of a statistics tree during a one-time scan of the data. ...
Privacy-Preserving Data Mining (PPDM) refers to data mining techniques developed to protect sensitive data while allowing useful information to be discovered from the data. In this chapter the authors review PPDM and present a broad survey of related issues, techniques, measures, applications, and regulation guidelines. The authors observe that the rapid pace of ...
Using Internet-enabled mobile handheld devices to access the World Wide Web is a promising addition to the Web and traditional e-commerce. Mobile handheld devices give mobile users convenient, portable access to the vast amount of information on the Internet from anywhere and at any time. However, mobile commerce has not enjoyed the same level of success as ...
Keys are character-based tools for plant identification. They are based on the decomposition of the plant into very small, atomistic parts. These parts are described with the technical and often arcane terminology of plant taxonomy. Even the best electronic keys (Delta, Lucid) make use of this terminology. Keys are not based on pattern recognition, the ...
Cloud computing is a paradigm in which data, applications, or software are accessed over a network. This network of servers is called the “Cloud”. Using clients such as desktops, entertainment centers, tablet computers, notebooks, wall computers, and handhelds, users can reach into the cloud for resources as they need them. Cloud computing is ...
Computing data cubes requires the aggregation of measures over arbitrary combinations of dimensions in a data set. Efficient data cube evaluation remains challenging because of the potentially very large sizes of input datasets (e.g., in the data warehousing context), the well-known curse of dimensionality, and the complexity of queries that need to be ...
Introduction: Since the late '80s and early '90s, database technologies have evolved to a new level of applications: online analytical processing (OLAP), where executive management can make quick and effective strategic decisions based on knowledge in terms of queries against large amounts of stored data. Some OLAP systems are also regarded as decision ...