We present an ensemble method for concept drift that dynamically creates and removes weighted experts in response to changes in performance. The method, dynamic weighted majority (DWM), uses four mechanisms to cope with concept drift: it trains online learners of the ensemble, it weights those learners based on their performance, it removes them, also based…
This paper considers additive factorial hidden Markov models, an extension to HMMs where the state factors into multiple independent chains, and the output is an additive function of all the hidden states. Although such models are very powerful, accurate inference is unfortunately difficult: exact inference is not computationally tractable, and existing…
Algorithms for tracking concept drift are important for many applications. We present a general method based on the Weighted Majority algorithm for using any on-line learner for concept drift. Dynamic Weighted Majority (DWM) maintains an ensemble of base learners, predicts using a weighted-majority vote of these "experts", and dynamically creates and…
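The ensemble idea in this abstract (weighted experts that vote, and that are created and removed as performance changes) can be sketched roughly as follows. The abstract is cut off before the details, so the specifics here are assumptions for illustration only: the discount factor `beta`, the removal threshold `theta`, the update `period`, the rule of adding a fresh expert after a global mistake, and the stand-in `MajorityLearner` base learner are all hypothetical choices, not taken from the text above.

```python
from collections import Counter, defaultdict


class MajorityLearner:
    """Hypothetical base learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Returns None until at least one example has been seen.
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn(self, x, y):
        self.counts[y] += 1


class DynamicWeightedMajority:
    """Sketch of a dynamically sized weighted-majority ensemble (assumed details)."""

    def __init__(self, make_learner, beta=0.5, theta=0.01, period=1):
        self.make_learner = make_learner  # factory for new base learners
        self.beta = beta                  # assumed weight discount for a wrong expert
        self.theta = theta                # assumed weight threshold for removal
        self.period = period              # assumed number of steps between updates
        self.experts = [[make_learner(), 1.0]]  # [learner, weight] pairs
        self.step = 0

    def predict(self, x):
        # Weighted-majority vote over the current experts.
        votes = defaultdict(float)
        for learner, weight in self.experts:
            votes[learner.predict(x)] += weight
        return max(votes, key=votes.get)

    def update(self, x, y):
        """Process one labelled example; return the ensemble's prediction for it."""
        self.step += 1
        at_period = self.step % self.period == 0
        votes = defaultdict(float)
        for expert in self.experts:
            guess = expert[0].predict(x)
            if guess != y and at_period:
                expert[1] *= self.beta  # discount experts that were wrong
            votes[guess] += expert[1]
        global_guess = max(votes, key=votes.get)
        if at_period:
            # Rescale so the best expert has weight 1, then drop weak experts.
            top = max(weight for _, weight in self.experts)
            for expert in self.experts:
                expert[1] /= top
            self.experts = [e for e in self.experts if e[1] >= self.theta]
            if global_guess != y:
                # The ensemble as a whole was wrong: add a fresh expert.
                self.experts.append([self.make_learner(), 1.0])
        # Every surviving expert trains online on the new example.
        for learner, _ in self.experts:
            learner.learn(x, y)
        return global_guess
```

Feeding such an ensemble a stream whose label distribution changes partway through shows the intended behaviour: experts trained on the old concept are discounted and eventually removed, while experts created after the change take over the vote.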
We describe the use of machine learning and data mining to detect and classify malicious executables as they appear in the wild. We gathered 1,971 benign and 1,651 malicious executables and encoded each as a training example using n-grams of byte codes as features. Such processing resulted in more than 255 million distinct n-grams. After selecting the…
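As a rough illustration of encoding an executable with byte-code n-grams, the sketch below counts overlapping byte n-grams and then builds a Boolean feature vector against a chosen vocabulary. The function names, the default `n=4`, and the Boolean present/absent encoding are assumptions made for this example; the abstract does not state them, and the selection step it mentions is cut off.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 4) -> Counter:
    """Count every overlapping n-gram of bytes in `data` (n is an assumed default)."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def binary_features(data: bytes, vocabulary: list, n: int = 4) -> list:
    """Hypothetical encoding: True where a vocabulary n-gram occurs in `data`."""
    present = set(byte_ngrams(data, n))
    return [gram in present for gram in vocabulary]
```

In practice `data` would be the raw contents of a file read in binary mode, and `vocabulary` would be whatever subset of n-grams survives feature selection.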
We consider the task of reinforcement learning with linear value function approximation. Temporal difference algorithms, and in particular the Least-Squares Temporal Difference (LSTD) algorithm, provide a method for learning the parameters of the value function, but when the number of features is large this algorithm can over-fit to the data and is…
This paper considers the sparse Gaussian conditional random field, a discriminative extension of sparse inverse covariance estimation, where we use convex methods to learn a high-dimensional conditional distribution of outputs given inputs. The model has been proposed by multiple researchers within the past year, yet previous papers have been substantially…