This paper presents novel cache optimizations for massively parallel, throughput-oriented architectures like GPUs. L1 data caches (L1 D-caches) are critical resources for providing high-bandwidth and low-latency data accesses. However, the high number of simultaneous requests from single-instruction multiple-thread (SIMT) cores makes the limited capacity of …
On-chip caches are commonly used in computer systems to hide long off-chip memory access latencies. To manage on-chip caches, either software-managed or hardware-managed schemes can be employed. State-of-the-art accelerators, such as the NVIDIA Fermi or Kepler GPUs and Intel's forthcoming MIC “Knights Landing” (KNL), support both software-managed …
Cultivars of hot pepper (Capsicum annuum L.) differ widely in their fruit cadmium (Cd) concentrations. Previously, we suggested that low-Cd cultivars are better able to prevent the translocation of Cd from roots to aboveground parts, but the corresponding mechanisms are still unknown. In this study, we aimed to improve understanding of the root …
A pot experiment was conducted to investigate the stability of Cd and/or Pb accumulation in the shoots of Cd and Pb pollution-safe cultivars (PSCs), the hereditary pattern of shoot Cd accumulation, and the transfer potentials of Cd and Pb in water spinach (Ipomoea aquatica Forsk.). A typical Cd-PSC, a typical non-Cd-PSC (Cd accumulative cultivar), a hybrid from …
Cultivars of hot pepper (Capsicum annuum L.) have different abilities to accumulate Cd in their fruits. Previously, we suggested that low-Cd cultivars take up more Cd, but can better prevent the Cd translocation from roots to aerial parts. However, the mechanisms involved in those processes are still unclear. In this study, we explored the roles of …
The large number of memory requests issued by massive numbers of threads can easily cause cache contention and cache-miss-related resource congestion on GPUs. This paper proposes a simple yet effective performance model to estimate the impact of cache contention and resource congestion as a function of the number of warps/thread blocks (TBs) to bypass the cache. Then we …
This paper describes a 3D computer architecture designed to achieve the lowest possible power consumption for “embedded applications” like radar and signal processing. It introduces several unique concepts including a low-power SIMD tile, low-power 3D memories, and 3D and 2.5D interconnect that is circuit switched so it can be tuned at run-time for a …
A pot experiment was conducted to investigate the translocation of cadmium (Cd) and lead (Pb) and assess the safety of edible parts in two cultivars of water spinach (Ipomoea aquatica Forsk.) contrasting in shoot Cd and Pb concentrations. A low-Cd-Pb cultivar (QLQ) and a high-Cd-Pb cultivar (T308) were grown in five soils with different concentrations of Cd …
Cadmium (Cd) contamination in agricultural products poses a threat to the humans who consume them. Sweet potato is the world's seventh most important food crop. The aims of this study were to screen for low-Cd sweet potato cultivars and clarify the mechanisms of low-Cd accumulation in edible roots. A pot experiment was conducted to investigate the variation of …
Caches are universally used in computing systems to hide long off-chip memory access latencies. Unlike on CPUs, the massive number of threads running simultaneously on GPUs puts tremendous pressure on the memory hierarchy. As a result, the limitation of cache resources becomes a bottleneck for a GPU to exploit thread-level parallelism (TLP) and memory-level parallelism (MLP) …