Sriram R. Vangal

A multicore processor in 65-nm technology with 80 single-precision floating-point cores delivers performance in excess of one teraflops while consuming less than 100 W. A 2D on-die mesh interconnection network operating at 5 GHz provides the high-performance communication fabric that connects the cores. The network delivers a bisection bandwidth of 2.56 …
This paper describes an integrated network-on-chip architecture containing 80 tiles arranged as an 8 × 10 2-D array of floating-point cores and packet-switched routers, both designed to operate at 4 GHz. Each tile has two pipelined single-precision floating-point multiply accumulators (FPMACs), which feature a single-cycle accumulation loop for high throughput. …
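The figures in this abstract are enough for a rough peak-throughput estimate. The sketch below is illustrative only: it assumes each FPMAC retires one multiply-accumulate per cycle and that this counts as two floating-point operations, which the abstract does not state. The function name `peak_flops` and its parameters are hypothetical.

```python
def peak_flops(tiles, fpmacs_per_tile, flops_per_fpmac_per_cycle, clock_hz):
    """Upper bound on floating-point throughput, in FLOP/s."""
    return tiles * fpmacs_per_tile * flops_per_fpmac_per_cycle * clock_hz


# From the abstract: 80 tiles, two FPMACs per tile, 4 GHz design target.
# Assumption (not from the abstract): one multiply-accumulate per FPMAC
# per cycle, counted as 2 floating-point operations.
peak = peak_flops(tiles=80, fpmacs_per_tile=2,
                  flops_per_fpmac_per_cycle=2, clock_hz=4e9)
print(f"{peak / 1e12:.2f} TFLOPS")  # prints "1.28 TFLOPS"
```

Under those assumptions the peak is consistent with the "in excess of a teraflops" claim; sustained performance depends on the workload and the actual operating frequency.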
This paper describes a multi-core processor that integrates 48 cores, 4 DDR3 memory channels, and a voltage regulator controller in a 6 × 4 2D-mesh network-on-chip architecture. Located at each mesh node is a five-port virtual cut-through packet-switched router shared between two IA-32 cores. Core-to-core communication uses message passing while exploiting …
The number of cores integrated onto a single die is expected to climb steadily in the foreseeable future. This move to many-core chips is driven by a need to optimize performance per watt. How best to connect these cores and how to program the resulting many-core processor, however, is an open research question. Designs vary from GPUs to cache-coherent …
The maximum operating frequency (Fmax) of a processor is traditionally set at a constant value based on the maximum operating temperature (Tmax) and supply voltage (Vcc) droops. In addition, the expected slowdown of the operating speed during processor lifetime due to transistor aging is used as a guardband for setting the Fmax. The maximum operating …