Risi Thonangi

Manually tuning tens to hundreds of configuration parameters in a complex software system like a database or an application server is an arduous task. Recent work has looked into automated approaches for recommending good configuration settings that adaptively search the full space of possible configurations. These approaches are based on conducting…
In this paper, we look at the problem of assigning labels to nodes of a dynamic XML tree such that the labels encode all ancestor-descendant relationships between the nodes and the document order of the nodes. Such labeling facilitates efficient XML query processing. A number of labeling schemes have been designed for this task. These schemes can be…
We consider the problem of efficiently computing weighted proximity best-joins over multiple lists, with applications in information retrieval and extraction. We are given a multi-term query, and for each query term, a list of all its matches with scores, sorted by locations. The problem is to find the overall best matchset, consisting of one match from…
Permutation is a fundamental operator for array data, with applications in, for example, changing matrix layouts and reorganizing data cubes. We consider the problem of permuting large quantities of data stored on secondary storage that supports fast random block accesses, such as solid state drives and distributed key-value stores. Faster random accesses…
Log-structured merge (LSM) is an increasingly prevalent approach to indexing, especially for modern write-heavy workloads. LSM organizes data in levels with geometrically increasing sizes. Records enter the top level; whenever a level fills up, it is merged down into the next level. Hence, the index is updated only through merges and records are never updated…
Recent studies in classification have proposed ways of exploiting the association rule mining paradigm. These studies have performed extensive experiments to show their techniques to be both efficient and accurate. However, existing studies in this paradigm either do not provide any theoretical justification behind their approaches or assume independence…