Paul Keltcher

Sparse matrix-vector multiplication (SpMV) is the core operation in many common network and graph analytics, but poor performance of the SpMV kernel handicaps these applications. This work quantifies the effect of matrix structure on SpMV performance, using Intel's VTune tool for the Sandy Bridge architecture. Two types of sparse matrices are considered:
Microprocessors have evolved over the last forty-plus years from purely sequential single-operation machines, to pipelined superscalar, to threaded and SIMD, and finally to multi-core and massive multi-core/thread machines. Despite these advances, the conceptual model programmers use to program them is still that of a single-threaded register file bound
Recent architectures in academia and industry have explored placing multiple processors on a single chip, but no consensus has emerged on the memory architecture. The recent availability of embedded DRAM (EDRAM) has further complicated the picture. In this investigation, we present a new and comprehensive comparison of four very different memory
