Periodicity mining is used for predicting trends in time series data. Discovering the rate at which the time series is periodic has always been an obstacle for fully automated periodicity mining. Existing periodicity mining algorithms assume that the periodicity rate (or simply the period) is user-specified. This assumption is a considerable limitation, …
Periodicity mining is used for predicting trends in time series data. Periodicity detection is an essential process in periodicity mining to discover potential periodicity rates. Existing periodicity detection algorithms do not take into account the presence of noise, which is inevitable in almost every real-world time series data. In this paper, we tackle …
Data cleaning is a vital process that ensures the quality of data stored in real-world databases. Data cleaning problems are frequently encountered in many research areas, such as knowledge discovery in databases, data warehousing, system integration and e-services. The process of identifying the record pairs that represent the same entity (duplicate …
The mining of periodic patterns in time series databases is an interesting data mining problem that can be envisioned as a tool for forecasting and predicting the future behavior of time series data. Existing periodic pattern mining algorithms either assume that the periodic rate (or simply the period) is user-specified, or try to detect potential values …
We present the demonstration of the design of "STEAM", Purdue Boilermakers' stream database system that allows for the processing of continuous and snapshot queries over data streams. Specifically, the demonstration focuses on the query processing engine, "Nile". Nile extends the query processor engine of an object-relational database management system, …
Mining of periodic patterns in time-series databases is an interesting data mining problem. It can be envisioned as a tool for forecasting and prediction of the future behavior of time-series data. Incremental mining refers to the issue of maintaining the discovered patterns over time in the presence of more items being added into the database. Because of …
For languages with rich content over the web, business reviews are easily accessible via many known websites, e.g., Yelp.com. For languages with poor content over the web, like Arabic, there are very few websites (we are actually aware of only one, and it is unpopular) that provide business reviews. However, this does not mean that such reviews do not …
In an error-free system with perfectly clean data, the construction of a global view of the data consists of linking (in relational terms, joining) two or more tables on their key fields. Unfortunately, most of the time, these data are neither carefully controlled for quality nor necessarily defined commonly across different data sources. As a result, the …
The role of data resources in today's business environment is multi-faceted. Primarily, they support the operational needs of an organization or a company. Secondarily, they can be used for decision support and management. The quality of the data used to support the operational needs is usually below the quality required for decision support and …
Sensor devices are becoming ubiquitous, especially in measurement and monitoring applications. Because of the real-time, append-only and semi-infinite nature of the generated sensor data streams, an online incremental approach is a necessity for mining stream data types. In this paper, we propose STAGGER: a one-pass, online and incremental algorithm for …