Hongwen Dai

This paper presents novel cache optimizations for massively parallel, throughput-oriented architectures like GPUs. L1 data caches (L1 D-caches) are critical resources for providing high-bandwidth and low-latency data accesses. However, the high number of simultaneous requests from single-instruction multiple-thread (SIMT) cores makes the limited capacity of(More)
On-chip caches are commonly used in computer systems to hide long off-chip memory access latencies. To manage on-chip caches, either software-managed or hardware-managed schemes can be employed. State-of-the-art accelerators, such as the NVIDIA Fermi or Kepler GPUs and Intel's forthcoming MIC "Knights Landing" (KNL), support both software-managed caches,(More)
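The distinction between the two schemes can be illustrated with a minimal simulation. The sketch below is not from the paper: it contrasts a hardware-managed LRU cache, whose eviction policy can thrash on an adversarial access pattern, with a software-managed cache (scratchpad), where the programmer explicitly pins data. The trace and capacities are illustrative assumptions.

```python
from collections import OrderedDict

def lru_hits(trace, capacity):
    """Hit count for a hardware-managed cache using LRU replacement."""
    cache, hits = OrderedDict(), 0
    for addr in trace:
        if addr in cache:
            cache.move_to_end(addr)  # refresh recency on a hit
            hits += 1
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least-recently used
            cache[addr] = True
    return hits

def scratchpad_hits(trace, pinned):
    """Hit count for a software-managed cache: the programmer pins a
    chosen working set, which is never evicted."""
    return sum(addr in pinned for addr in trace)

# A cyclic trace over 3 lines thrashes a 2-line LRU cache completely,
# but pinning any 2 of the 3 lines by hand captures most accesses.
trace = [0, 1, 2, 0, 1, 2, 0, 1, 2]
print(lru_hits(trace, 2))              # 0 hits: every access misses
print(scratchpad_hits(trace, {0, 1}))  # 6 hits: pinned lines always hit
```

This is the classic argument for exposing a scratchpad alongside a hardware cache: when the programmer knows the reuse pattern, explicit placement can beat a fixed replacement policy.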
The large number of memory requests from massive numbers of threads can easily cause cache contention and cache-miss-related resource congestion on GPUs. This paper proposes a simple yet effective performance model to estimate the impact of cache contention and resource congestion as a function of the number of warps/thread blocks (TBs) to bypass the cache. Then we(More)
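A simple capacity-based version of such a bypassing decision can be sketched as follows. This is a hypothetical illustration, not the paper's actual model: it assumes each TB has a known cache footprint and routes around the L1 any TBs whose footprints would not fit, so the remaining TBs avoid contention.

```python
def tbs_to_bypass(num_tbs, footprint_per_tb_kb, l1_capacity_kb):
    """Estimate how many thread blocks (TBs) should bypass the L1 so that
    the combined footprint of the remaining TBs fits in the cache.
    (Illustrative capacity model only; parameters are assumptions.)"""
    fitting = l1_capacity_kb // footprint_per_tb_kb  # TBs the L1 can hold
    return max(0, num_tbs - fitting)

# Example: 8 concurrent TBs, each touching ~12 KB, on a 48 KB L1:
# 4 TBs fit in the cache, so the other 4 should bypass it.
print(tbs_to_bypass(8, 12, 48))  # → 4
```

A real model would also account for miss-related resource congestion (e.g. MSHR occupancy), which is what makes bypassing attractive even when data nominally fits.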
Cultivars of hot pepper (Capsicum annuum L.) vary greatly in their fruit cadmium (Cd) concentration. Previously, we identified a low-Cd (YCT) and a high-Cd (JFZ) cultivar. In this study, we elucidated the physiological mechanisms resulting in the differences in their Cd accumulation. A time-dependent and concentration-dependent hydroponic experiment was(More)