Learn More
One of the major problems in pattern mining is the explosion of the number of results. Tight constraints reveal only common knowledge, while loose constraints lead to an explosion in the number of returned patterns. This is caused by large groups of patterns essentially describing the same set of transactions. In this paper we approach this problem using(More)
—Given a snapshot of a large graph, in which an infection has been spreading for some time, can we identify those nodes from which the infection started to spread? In other words, can we reliably tell who the culprits are? In this paper we answer this question affirmatively, and give an efficient method called NETSLEUTH for the well-known(More)
How can we succinctly describe a million-node graph with a few simple sentences? How can we measure the 'importance' of a set of discovered subgraphs in a large graph? These are exactly the problems we focus on. Our main ideas are to construct a 'vocabulary' of subgraph-types that often occur in real graphs (e.g., stars, cliques, chains), and from a set of(More)
An ideal outcome of pattern mining is a small set of informative patterns, containing no redundancy or noise, that identifies the key structure of the data at hand. Standard frequent pattern miners do not achieve this goal, as due to the pattern explosion typically very large numbers of highly redundant patterns are returned. We pursue the ideal for(More)
Knowledge discovery from data is an inherently iterative process. That is, what we know about the data greatly determines our expectations, and therefore, what results we would find interesting and/or surprising. Given new knowledge about the data, our expectations will change. Hence, in order to avoid redundant results, knowledge discovery algorithms(More)
Vehicular travel is increasing throughout the world, particularly in large urban areas. Therefore the need arises for simulating and optimizing traffic control algorithms to better accommodate this increasing demand. In this paper we study the simulation and optimization of traffic light controllers in a city and present an adaptive optimization algorithm(More)
—Statistical assessment of the results of data mining is increasingly recognised as a core task in the knowledge discovery process. It is of key importance in practice, as results that might seem interesting at first glance can often be explained by well-known basic properties of the data. In pattern mining, for instance, such trivial results can be so(More)