Xiaowei Jiang

As manycore architectures enable a large number of cores on the die, a key challenge that emerges is the availability of memory bandwidth with conventional DRAM solutions. To address this challenge, integration of large DRAM caches that provide as much as 5× higher bandwidth and as low as 1/3rd of the latency (as compared to conventional DRAM) is …
As transistor density continues to grow at an exponential rate in accordance with Moore's law, the goal for many Chip Multi-Processor (CMP) systems is to scale the number of on-chip cores proportionally. Unfortunately, off-chip memory bandwidth capacity is projected to grow slowly compared to the desired growth in the number of cores. This creates a situation …
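As a rough, purely illustrative calculation (the starting values and growth rates below are assumptions for the sketch, not figures from the paper), the following C snippet shows how per-core off-chip bandwidth erodes when the core count grows faster than pin bandwidth:

    /* Illustrative only: hypothetical growth rates, not data from the paper.
     * Shows per-core off-chip bandwidth shrinking when core counts grow
     * faster than total memory bandwidth. */
    #include <stdio.h>

    int main(void) {
        double cores       = 8.0;   /* assumed starting core count        */
        double bw_gbs      = 25.0;  /* assumed starting bandwidth (GB/s)  */
        double core_growth = 2.0;   /* cores double per generation        */
        double bw_growth   = 1.3;   /* bandwidth grows 30% per generation */

        for (int gen = 0; gen <= 4; gen++) {
            printf("gen %d: %6.0f cores, %7.1f GB/s total, %5.2f GB/s per core\n",
                   gen, cores, bw_gbs, bw_gbs / cores);
            cores  *= core_growth;
            bw_gbs *= bw_growth;
        }
        return 0;
    }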
Chip Multi-Processor (CMP) architectures have recently become a mainstream computing platform. Recent CMPs allow cores to share expensive resources, such as the last level cache and off-chip pin bandwidth. To improve system performance and reduce the performance volatility of individual threads, last level cache and off-chip bandwidth partitioning schemes …
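A minimal sketch of one such mechanism, last-level-cache way-partitioning, is shown below; the way masks and victim-selection logic are generic illustrations of the idea, not the specific schemes proposed in the paper:

    /* Sketch of LLC way-partitioning: each core owns a bitmask of ways, and
     * on a miss the victim is chosen only from the requesting core's ways.
     * The masks and ages here are hypothetical. */
    #include <stdint.h>
    #include <stdio.h>

    #define NUM_WAYS 16

    /* way_mask[core] has bit w set if way w belongs to that core */
    static uint16_t way_mask[4] = { 0x000F, 0x00F0, 0x0F00, 0xF000 };

    static int pick_victim(int core, const int lru_age[NUM_WAYS]) {
        int victim = -1, oldest = -1;
        for (int w = 0; w < NUM_WAYS; w++) {
            if (!(way_mask[core] & (1u << w)))
                continue;                  /* way not owned by this core      */
            if (lru_age[w] > oldest) {     /* larger age = less recently used */
                oldest = lru_age[w];
                victim = w;
            }
        }
        return victim;
    }

    int main(void) {
        int ages[NUM_WAYS] = { 3, 9, 1, 7, 2, 2, 8, 4, 0, 5, 6, 1, 3, 9, 2, 7 };
        printf("core 1 evicts way %d\n", pick_victim(1, ages)); /* among ways 4..7 */
        return 0;
    }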
Over the last decade, homogeneous multi-core processors emerged and became the de facto approach for offering high parallelism, high performance and scalability for a wide range of platforms. We are now at an interesting juncture where several critical factors (smaller form-factor devices, power challenges, the need for specialization, etc.) are guiding …
The goal of this paper is to propose a scheme that provides comprehensive security protection for the heap. Heap vulnerabilities are increasingly being exploited for attacks on computer programs. In most implementations, the heap management library keeps the heap meta-data (heap structure information) and the application's heap data in an interleaved …
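The sketch below illustrates why that interleaving is dangerous, using a hypothetical dlmalloc-style chunk layout rather than the paper's protection scheme: a small overflow in the application's data lands directly on the allocator's meta-data for the next chunk.

    /* Illustration (not the paper's scheme): an allocator that stores a chunk
     * header immediately before each user buffer lets a heap overflow in the
     * application's data overwrite the allocator's meta-data. */
    #include <stddef.h>
    #include <stdio.h>
    #include <string.h>

    struct chunk_header {        /* heap meta-data, inline with user data */
        size_t prev_size;
        size_t size;             /* low bits often hold allocator flags   */
    };

    int main(void) {
        /* Two adjacent chunks laid out the way an interleaved allocator would. */
        _Alignas(max_align_t) unsigned char heap[2 * (sizeof(struct chunk_header) + 16)];
        struct chunk_header *c0 = (struct chunk_header *)heap;
        unsigned char *user0    = heap + sizeof(*c0);
        struct chunk_header *c1 = (struct chunk_header *)(user0 + 16);

        c0->prev_size = 0;  c0->size = 16;
        c1->prev_size = 16; c1->size = 16;

        /* A 32-byte write into a 16-byte buffer spills into c1's header. */
        memset(user0, 'A', 32);
        printf("next chunk's size field after the overflow: 0x%zx\n", c1->size);
        return 0;
    }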
In current Chip-multiprocessors (CMPs), a significant portion of the die is consumed by the last-level cache. Until recently, the balance of cache and core space has been primarily guided by the needs of single applications. However, as multiple applications or virtual machines (VMs) are consolidated on such a platform, researchers have observed that not …
Today’s multicore processors already integrate multiple cores on a die, and many-core architectures enable far more small cores for throughput computing. The key challenge in many-core architectures is the memory bandwidth wall: supplying the memory bandwidth required to keep all cores running smoothly. In the past, researchers have …
The integration of the .NET Common Language Runtime (CLR) inside the SQL Server DBMS enables database programmers to write business logic in the form of functions, stored procedures, triggers, data types, and aggregates using modern programming languages such as C#, Visual Basic, C++, COBOL, and J++. This paper presents three main aspects of this work.
Bulk memory copying and initialization are among the most common operations performed in current computer systems by both user applications and operating systems. While many current systems rely on a loop of loads and stores, there are proposals to introduce a single instruction to perform bulk memory copying. While such an instruction can improve …
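The contrast the abstract draws can be sketched as follows; here memcpy merely stands in for a single bulk-copy primitive and is not the proposed instruction:

    /* Word-by-word load/store loop versus a single bulk-copy call.
     * memcpy is only a software stand-in for a bulk-copy instruction. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    static void copy_loop(uint64_t *dst, const uint64_t *src, size_t words) {
        for (size_t i = 0; i < words; i++)
            dst[i] = src[i];                /* one load + one store per word */
    }

    int main(void) {
        enum { WORDS = 1024 };
        static uint64_t src[WORDS], dst_a[WORDS], dst_b[WORDS];
        for (size_t i = 0; i < WORDS; i++) src[i] = i;

        copy_loop(dst_a, src, WORDS);        /* explicit load/store loop */
        memcpy(dst_b, src, sizeof(src));     /* single bulk-copy call    */

        printf("copies match: %d\n", memcmp(dst_a, dst_b, sizeof(src)) == 0);
        return 0;
    }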
Inability to hide main memory latency has been increasingly limiting the performance of modern processors. The problem is worse in large-scale shared memory systems, where remote memory latencies are hundreds, and soon thousands, of processor cycles. To mitigate this problem, we propose an intelligent memory and cache coherence controller (AMC) that can …