Richard A. Hankins

Learn More
Graphs are widely used to model real world objects and their relationships, and large graph datasets are common in many application domains. To understand the underlying characteristics of large graphs, graph summarization techniques are critical. However, existing graph summarization methods are mostly statistical (studying statistics such as degree(More)
Sequence datasets are ubiquitous in modern life-science applications, and querying sequences is a common and critical operation in many of these applications. The suffix tree is a versatile data structure that can be used to evaluate a wide variety of queries on sequence datasets, including evaluating exact and approximate string matches, and finding repeat(More)
In main-memory databases, the number of processor cache misses has a critical impact on the performance of the system. Cache-conscious indices are designed to improve performance by reducing the number of processor cache misses that are incurred during a search operation. Conventional wisdom suggests that the index's node size should be equal to the cache(More)
The number of processor cache misses has a critical impact on the performance of DBMSs running on servers with large main-memory configurations. In turn, the cache utilization of database systems is highly dependent on the physical organization of the records in main-memory. A recently proposed storage model, called PAX, was shown to greatly improve the(More)
Recent studies have shown that most SPEC CPU2K benchmarks exhibit strong phase behavior, and the Cycles per Instruction (CPI) performance metric can be accurately predicted based on program's control-flow behavior, by simply observing the sequencing of the program counters, or extended instruction pointers (EIPs). One motivation of this paper is to see if(More)
On-Line Transaction Processing (OLTP) workloads are crucial benchmarks for the design and analysis of server processors. Typical cached configurations used by researchers to simulate OLTP workloads are orders of magnitude smaller than the fully scaled configurations used by OEM vendors to achieve world-record transaction processing throughput. The objective(More)
Microprocessor design is undergoing a major paradigm shift towards multi-core designs, in anticipation that future performance gains will come from exploiting threadlevel parallelism in the software. To support this trend, we present a novel processor architecture called the Multiple Instruction Stream Processing (MISP) architecture. MISP introduces the(More)
Much research focuses on predicting a person's geo-spatial traversal patterns using a history of recorded geo-coordinates. In this paper, we focus on the problem of predicting <i>location-state</i> transitions. Location-states for a user refer to a set of anchoring points/regions in space, and the prediction task produces a sequence of predicted location(More)
We introduce our work on the use of prediction markets as a tool for mining knowledge from within an enterprise (i.e., Nokia Corporation). Various companies – including Google, HP, and Yahoo – have experimented with internal prediction market services, testing its utility for tapping knowledge distributed and nascent within their organization. We are also(More)