An emerging problem in data streams is the detection of concept drift. The problem is harder when the drift is gradual over time. In this work we define a method for detecting concept drift, even in the case of slow gradual change. It is based on the estimated distribution of the distances between classification errors. The proposed method can be used…
* This work has been partially funded by the MOISES project, number TIC-2002-04019-C03-02, of the Ministerio de Ciencia y Tecnología.
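The idea of watching the distances between classification errors can be illustrated with a small sketch. It is not the method from the abstract above, whose details are truncated here. The class name `ErrorDistanceMonitor`, the score `mean + 2 * std`, the `drift_ratio` threshold and the `min_errors` warm-up are all assumptions chosen for illustration. The sketch keeps a running mean and standard deviation of the number of examples between consecutive errors, and signals drift when the current score falls well below the best score seen so far.

```python
import math


class ErrorDistanceMonitor:
    """Illustrative drift monitor based on distances between errors.

    All names and thresholds here are hypothetical, not taken from the paper.
    """

    def __init__(self, drift_ratio=0.9, min_errors=30):
        self.drift_ratio = drift_ratio  # assumed threshold on score / best score
        self.min_errors = min_errors    # assumed warm-up before any decision
        self.n_errors = 0
        self.since_last_error = 0
        self.mean = 0.0                 # running mean of error distances
        self.m2 = 0.0                   # running sum of squared deviations
        self.best = 0.0                 # highest score seen after warm-up

    def update(self, is_error):
        """Feed one prediction outcome; return True if drift is signalled."""
        self.since_last_error += 1
        if not is_error:
            return False

        # Distance, in examples, since the previous error.
        distance = self.since_last_error
        self.since_last_error = 0

        # Welford's online update of mean and variance.
        self.n_errors += 1
        delta = distance - self.mean
        self.mean += delta / self.n_errors
        self.m2 += delta * (distance - self.mean)
        std = math.sqrt(self.m2 / self.n_errors)

        score = self.mean + 2.0 * std
        if self.n_errors < self.min_errors:
            return False

        self.best = max(self.best, score)
        # Errors arriving closer together pull the score below its best value.
        return score / self.best < self.drift_ratio
```

A caller would invoke `update(prediction != label)` once per example and rebuild or adapt the model when it returns `True`. Because the score includes the standard deviation, it can rise briefly right after a change before it falls, so detection in this sketch is delayed rather than immediate.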
Social networks play an increasingly important role in shaping the behaviour of users of the Web. Conceivably, Twitter stands out from the others, not only for the platform's simplicity but also for the great influence that the messages sent over the network can have. The impact of such messages determines the influence of a Twitter user and is what tools…
Incremental learning is an approach to deal with the classification task when datasets are too large or when new examples can arrive at any time. One possible approach uses concentration bounds (like Chernoff or Hoeffding bounds) to ensure that expansions are done when the number of examples supports the change. Two algorithms that use this approach are…
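As a rough illustration of how a concentration bound can gate an expansion, the sketch below computes the standard Hoeffding bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)), and allows an expansion only when the observed advantage of the best candidate exceeds it. The function names, the use of a "gain" score and the example values are assumptions for illustration. The abstract does not say how its two algorithms apply the bound.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the true mean of a variable
    with range `value_range` lies within epsilon of the mean of n observations."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))


def enough_evidence(best_gain, second_gain, value_range, delta, n):
    """Hypothetical expansion test: expand only if the gap between the two best
    candidates is larger than the bound for the n examples seen so far."""
    epsilon = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > epsilon


# With a gap of 0.05 and delta = 0.05, 100 examples are not enough,
# but 1000 examples are.
print(enough_evidence(0.30, 0.25, value_range=1.0, delta=0.05, n=100))
print(enough_evidence(0.30, 0.25, value_range=1.0, delta=0.05, n=1000))
```

The bound shrinks as `n` grows, so a small but real advantage is eventually accepted once enough examples have arrived.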
Learning in data streams is a problem of growing interest. The target function of data streams may change over time, so in such situations, a learning model induced with some previous data may be inconsistent with the current data. This problem is commonly known as concept drift. The strategy broadly used to handle concept drift is to continuously monitor a…
Real-time classification of massive email data is a challenging task with its own particular difficulties. Since email data has an important temporal component, several problems arise: emails arrive continuously, and the criteria used to classify those emails can change, so the learning algorithms have to be able to deal with concept drift.…