Muhamed F. Mudawar

Learn More
CMP processors are already replacing complex single core superscalar processor architectures. They offer better performance per watt and area. This is especially true in TLP rich server and web applications. Process / thread synchronization is important since CMP consists of multiple processor cores sharing cache resources including shared data structures.(More)
The cache hierarchy design in existing SMT and superscalar processors is optimized for latency, but not for bandwidth. The size of the L1 data cache did not scale over the past decade. Instead, larger unified L2 and L3 caches were introduced. This cache hierarchy has a high overhead due to the principle of containment, as all the cache blocks in the upper(More)
  • 1