Sai Rahul Chalamalasetti

Providing low-latency access to large amounts of data is one of the foremost requirements for many web services. To address these needs, systems such as Memcached have been created which provide a distributed, all in-memory key-value store. These systems are critical and often deployed across hundreds or thousands of servers. However, these systems are not …
We present the architecture and programming model for MORA, a coarse-grained reconfigurable processor aimed at multimedia applications. The MORA architecture is a MIMD machine consisting of a 2-D array of reconfigurable cells (RC) with a flexible reconfigurable interconnect network. MORA is designed to support high-throughput data-parallel pipelined …
This paper presents an architecture and implementation details for MORA, a novel coarse-grained reconfigurable processor for accelerating media processing applications. The MORA architecture involves a 2-D array of several such processors, to deliver low-cost, high-throughput performance in media processing applications. A distinguishing feature of the MORA …
Emerging data-centric workloads that operate on and harvest useful insights from large amounts of unstructured data require corresponding new data-centric system architecture optimizations. In particular, with the growing importance of power and cooling costs, a key challenge for such future designs is to achieve increased performance at high energy …
This letter presents the design and evaluation of a coarse-grained reconfigurable array hardened against radiation-induced transient errors. The architecture consists of an 8 × 8 array of reconfigurable cells, each provided with built-in soft error detection and instruction roll-back control. We also present the communication management scheme …
This paper presents new power-efficient, high-throughput data paths for portable multimedia devices. The various data paths provide support for dense arithmetic operations. This work provides the performance evaluation for a library of reconfigurable data path elements (Processing Elements) previously proposed and presents two new processing element …
This work presents an effort to bridge the gap between abstract high-level programming and OpenCL by extending an existing high-level Java programming framework (APARAPI), based on OpenCL, so that it can be used to program FPGAs at a high level of abstraction and with increased ease of programmability. We run several real-world algorithms to assess the …
MORA is a novel platform for high-level FPGA programming of streaming vector and matrix operations, aimed at multimedia applications. It consists of a soft array of pipelined low-complexity SIMD processors-in-memory (PIM). We present a Domain-Specific Language (DSL) for high-level programming of the MORA soft processor array. The DSL is embedded in C++, …