Learn More
This paper presents Warped-Compression, a warp-level register compression scheme for reducing GPU power consumption. This work is motivated by the observation that the register values of threads within the same warp are similar, namely the arithmetic differences between two successive thread registers is small. Removing data redundancy of register values(More)
This paper presents a cooperative heterogeneous computing framework which enables the efficient utilization of available computing resources of host CPU cores for CUDA kernels, which are designed to run only on GPU. The proposed system exploits at runtime the coarse-grain thread-level parallelism across CPU and GPU, without any source recompilation. To this(More)
This paper presents a cooperative heterogeneous computing framework which enables the efficient utilization of available computing resources of host CPU cores for CUDA kernels, which are designed to run only on GPU. The proposed system exploits at runtime the coarse-grain thread-level parallelism across CPU and GPU, without any source recompilation. To this(More)
Summary form only given. Speculative preexecution achieves efficient data prefetching by running additional prefetching threads on spare hardware contexts. Various implementations for speculative preexecution have been proposed, including compiler-based static approaches and hardware-based dynamic approaches. A static approach defines the p-thread at(More)
Mobile peer-to-peer (P2P) systems have recently got in the limelight of the research community that is striving to build efficient and effective mobile content addressable networks. Along this line of research, we propose a new peer-to-peer file sharing protocol suited to mobile ad hoc networks (MANET). The main ingredients of our protocol are network(More)
Graphics processors evolve rapidly and promise to support power-efficient, cost, differentiated price-performance, and scalable high performance computing. MapReduce is a well-known distributed programming model to ease the development of applications for large-scale data processing on a large number of commodity CPUs. When compared to CPUs, GPUs are an(More)
In this paper, we investigate parallel implementation techniques for network coding. It is known that network coding is useful for both wired and wireless networks and it also mitigates peer/piece selection problems in P2P file sharing systems. However, due to the decoding complexity of network coding, there have been concerns about adoption of network(More)
—As Internet and information technology have continued developing, the necessity for fast packet processing in computer networks has also grown in importance. All emerging network applications require deep packet classification as well as security-related processing and they should be run at line rates. Hence, network speed and the complexity of network(More)
Network coding is a well-known technique used to enhance network throughput and reliability by applying special coding to data packets. One critical problem in practice, when using the random linear network coding technique, is the high computational overhead. More specifically, using this technique in embedded systems with low computational power might(More)
—Vehicular ad hoc networks (VANET) aims to enhance vehicle navigation safety by providing an early warning system: any chance of accidents is informed through the wireless communication between vehicles. For the warning system to work, it is crucial that safety messages be reliably delivered to the target vehicles in a timely manner and thus reliable and(More)