Petri net versus modulo scheduling for software pipelining


Untolerated load instruction latencies often have a significant impact on overall program performance. As one means of mitigating this effect, we present an aggressive hardware-based mechanism that provides effective support for reducing the latency of load instructions. Through the judicious use of instruction predecode, base register caching, and fast…
DOI: 10.1145/225160.225179