We describe a statistical signature of chunks and an algorithm for finding chunks. While there is no formal definition of chunks, they may be reliably identified as configurations with low internal entropy or unpredictability and high entropy at their boundaries. We show that the log frequency of a chunk is a measure of its internal entropy. The …
We propose a novel batch active learning method that leverages the availability of high-quality and efficient sequential active-learning policies by approximating their behavior when applied for k steps. Specifically, our algorithm uses Monte-Carlo simulation to estimate the distribution of unlabeled examples selected by a sequential policy over k steps. …
This paper describes an unsupervised algorithm for segmenting categorical time series into episodes. The Voting-Experts algorithm first collects statistics about the frequency and boundary entropy of ngrams, then passes a window over the series and has two "expert methods" decide where in the window boundaries should be drawn. The algorithm successfully …
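The statistics-gathering step mentioned above can be illustrated with a minimal sketch. It assumes that the boundary entropy of an ngram is the Shannon entropy (in bits) of the symbol that immediately follows it in the series; the function name `boundary_entropy` is hypothetical, and the windowed voting step is not shown.

```python
from collections import Counter, defaultdict
from math import log2


def boundary_entropy(series, n):
    """Map each ngram of length n to the entropy of its successor symbol.

    Assumption of this sketch: "boundary entropy" means the Shannon
    entropy of the distribution of symbols that directly follow the ngram.
    """
    successors = defaultdict(Counter)
    # Count which symbol follows each occurrence of each ngram.
    for i in range(len(series) - n):
        ngram = tuple(series[i:i + n])
        successors[ngram][series[i + n]] += 1

    entropy = {}
    for ngram, counts in successors.items():
        total = sum(counts.values())
        entropy[ngram] = -sum(
            (c / total) * log2(c / total) for c in counts.values()
        )
    return entropy
```

In the series "abacabac", for example, the symbol "a" is followed equally often by "b" and "c", so its boundary entropy is high, while "b" is always followed by "a", so its boundary entropy is zero.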
Let us call a sequence of numbers heapable if they can be sequentially inserted to form a binary tree with the heap property, where each insertion subsequent to the first occurs at a leaf of the tree, i.e. below a previously placed number. In this paper we consider a variety of problems related to heapable sequences and subsequences that do not appear to …
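The definition of heapability above can be made concrete with a small greedy check. This is a sketch under stated assumptions, not necessarily the procedure used in the paper: the heap property is taken to mean that a parent is no larger than its children, and each new number is attached to the free child position whose parent value is the largest one not exceeding it. The name `is_heapable` is hypothetical.

```python
from bisect import bisect_right, insort


def is_heapable(seq):
    """Greedy heapability check for a binary min-heap-ordered tree.

    `slots` is a sorted list with one entry per free child position,
    holding the value of the node that owns that position.
    """
    slots = []
    for i, x in enumerate(seq):
        if i > 0:
            # Largest parent value <= x that still has a free position.
            j = bisect_right(slots, x) - 1
            if j < 0:
                return False  # no valid parent: x cannot be placed
            slots.pop(j)
        # The newly placed node offers two free child positions.
        insort(slots, x)
        insort(slots, x)
    return True
```

For instance, the sequence 1, 3, 2, 1 is rejected: 3 and 2 both have to go below the root, which leaves no free position under a value at most 1 for the final 1.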
We give a (ln n + 1)-approximation for the decision tree (DT) problem. An instance of DT is a set of m binary tests T = (T₁, …, Tₘ) and a set of n items X = (X₁, …, Xₙ). The goal is to output a binary tree where each internal node is a test, each leaf is an item and the total external path length of the tree is minimized. Total external path length is the sum …
We consider the problem of finding minimum reset sequences in synchronizing automata. The well-known Černý conjecture states that every n-state synchronizing automaton has a reset sequence with length at most (n − 1)². While this conjecture gives an upper bound on the length of every reset sequence, it does not directly address the problem of finding the …
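To make the notion of a reset sequence concrete, here is a minimal sketch that checks whether a word synchronizes an automaton and finds a shortest reset word by breadth-first search over subsets of states. The search is exponential in the number of states and is only an illustration, not the method studied in the paper. All function names are hypothetical, and the example automaton is the standard Černý automaton, in which the letter a cyclically shifts the states and the letter b sends the last state to state 0 and fixes the others.

```python
from collections import deque


def cerny_automaton(n):
    """Transition table of the n-state Cerny automaton (states 0..n-1)."""
    return {
        "a": [(q + 1) % n for q in range(n)],             # cyclic shift
        "b": [0 if q == n - 1 else q for q in range(n)],  # merge n-1 into 0
    }


def apply_word(delta, states, word):
    """Return the set of states reached from `states` by reading `word`."""
    current = set(states)
    for letter in word:
        current = {delta[letter][q] for q in current}
    return current


def is_reset_word(delta, n, word):
    """A reset word sends every state to one and the same state."""
    return len(apply_word(delta, range(n), word)) == 1


def shortest_reset_word(delta, n):
    """Breadth-first search in the subset automaton, from the full state set."""
    start = frozenset(range(n))
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        subset, word = queue.popleft()
        if len(subset) == 1:
            return word
        for letter in sorted(delta):
            nxt = frozenset(delta[letter][q] for q in subset)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + letter))
    return None  # the automaton is not synchronizing
```

For the 4-state Černý automaton, the word baaabaaab is a reset word of length 9, which equals (n − 1)² for n = 4.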
The Hierarchical Agent Control Architecture (HAC) is a general toolkit for specifying an agent's behavior. HAC supports action abstraction, resource management, and sensor integration, and is well suited to controlling large numbers of agents in dynamic environments. It relies on three hierarchies: action, sensor, and context. The action hierarchy controls the …
This paper describes an unsupervised algorithm for segmenting categorical time series into episodes. The VOTING-EXPERTS algorithm first collects statistics about the frequency and boundary entropy of ngrams, then passes a window over the series and has two "expert methods" decide where in the window boundaries should be drawn. The algorithm successfully …
We introduce the Constrained Subtree Selection (CSS) problem as a model for the optimal design of websites. Given a hierarchy of topics represented as a DAG G and a probability distribution over the topics, we select a subtree of the transitive closure of G which minimizes the expected path cost. We define path cost as the sum of the page costs along a path …