Ninghan Zheng

Helper-threaded prefetching on chip multiprocessors (CMPs) is a well-known approach to reducing memory latency and has been explored for accesses to linked data structures. However, conventional helper-threaded prefetching often suffers from useless prefetches and cache thrashing, which limit its effectiveness. In this paper, we first analyze the shortcomings…
The data needs of scientific and commercial applications across a diverse range of fields have been increasing exponentially in recent years. Although traditional systems work well for computation that requires limited data handling, CMPs in cloud computing may perform poorly for data-intensive computation that handles large amounts of data. …
Threaded prefetching on a chip multiprocessor (CMP) issues memory requests for data needed later by the main computation, and may therefore increase the pressure on limited shared cache space and bus bandwidth. In our earlier work, we proposed an effective threaded prefetching technique that selects a proper prefetch distance for specific…
Helper-threaded prefetching on chip multiprocessors has been shown to reduce memory latency and improve overall system performance, and has been explored for accesses to linked data structures. In our earlier work, we proposed an effective threaded prefetching technique that balances delinquent loads between the main thread and the helper thread to improve…
With the rapid development of computer science and technology, Computer Aided Instruction (CAI) has been playing an increasingly crucial role in modern teaching management and in education itself. In teaching modules on the art of programming, human graders can be effectively replaced by automated programming-assignment graders such as Online…
Helper threading is a promising prefetching technique for bridging the memory wall on contemporary CMP platforms. However, synchronization between the application and the helper thread is critical to the performance improvement. Previous research has mainly focused on loop-count-based synchronization, which is only suitable for a main thread that has enough…
Since the spring of 2005, our university has taught a first course on data caches on CMPs, drawn from computer architecture. We have accomplished several goals, the most important of which is an analytical and experimental approach to pull-based and push-based data prefetching on CMPs. The pedagogical style embodied in this course…