Learn More
This paper introduces a generic and scalable framework for automated anomaly detection on large scale time-series data. Early detection of anomalies plays a key role in maintaining consistency of person's data and protects corporations against malicious attackers. Current state of the art anomaly detection approaches suffer from scalability, use-case(More)
Pairwise Markov Networks (PMN) are an important class of Markov networks which, due to their simplicity, are widely used in many applications such as image analysis, bioinformatics, sensor networks, etc. However, learning of Markov networks from data is a challenging task; there are many possible structures one must consider and each of these structures(More)
Faced with the problem of characterizing systematic changes in multivariate time series in an unsupervised manner, we derive and test two methods of regularizing hidden Markov models for this task. Regularization on state transitions provides smooth transitioning among states, such that the sequences are split into broad, contiguous segments. Our methods(More)
Abstraction provides cognition economy and generalization skill in addition to facilitating knowledge communication for learning agents situated in real world. Concept learning introduces a way of abstraction which maps the continuous state and action spaces into entities called concepts. Of computational concept learning approaches, action-based(More)
In this paper, we propose a new framework for constructing text metrics which can be used to compare and support inferences among terms and sets of terms. Our metric is derived from data-driven kernels on graphs that let us capture global relations among terms and sets of terms, regardless of their complexity and size. To compute the metric efficiently for(More)
Diffusion maps are among the most powerful Machine Learning tools to analyze and work with complex high-dimensional datasets. Unfortunately, the estimation of these maps from a finite sample is known to suffer from the curse of dimensionality. Motivated by other machine learning models for which the existence of structure in the underlying distribution of(More)
In recent years, non-parametric methods utilizing random walks on graphs have been used to solve a wide range of machine learning problems, but in their simplest form they do not scale well due to the quadratic complexity. In this paper, a new dual-tree based variational approach for approximating the transition matrix and efficiently performing the random(More)
The dual-tree block partitioning approach, described in the main paper, depends on a known tree structure defining a rough hierarchical partition of data points (and kernels). The two trees–the data partition tree and the kernel partition tree–are identical in our framework. Clearly, the quality of the approximation for P and our ability to represent it(More)