Xiaoya Fan

A multi-operand adder is an attractive alternative to a network of 2-operand adders for accelerating algorithms that contain many addition operations. This paper presents an improved 3-operand floating-point (FP) adder. First, an internal width for the adder is given that is compatible with IEEE Std 754. Second, a…
Aggressive prefetching can cause considerable inter-core interference and lead to large performance degradation in shared-memory CMP systems. This paper aims to improve system performance and make prefetching effective. We study prefetching-induced inter-core interference in CMP systems and propose a Global Prefetcher Aggressiveness Control Scheme (GPACS) to reduce useless…
With the advent of chip multiprocessor (CMP) architectures, programmers must tune their programs to the architecture in order to fully utilize hardware resources. Parallelizing multimedia applications on a CMP is a major obstacle. In this paper, we introduce the potential parallelism in multimedia applications and the multi-grain parallelism…
In a shared-memory chip multiprocessor (CMP), data shared between cores must be exchanged through the shared last-level cache, and cache coherence must be maintained at the same time. As the number of cores increases, the cache coherence wall becomes increasingly serious. For multimedia applications full of streaming-like data, existing…
The technological parameters and constraints of many-core processors with shared-memory systems demand new solutions to the cache coherence problem. Directory-based coherence protocols have recently been regarded as a possible scalable alternative for CMP designs. Unfortunately, as the number of on-chip cores increases, many directory design…