Learn More
Machine learning and data mining are gaining increasing attentions of the computing society. FPGA provides a highly parallel, low power, and flexible hardware platform for this domain, while the difficulty of programming FPGA greatly limits its prevalence. MapReduce is a parallel programming framework that could easily utilize inherent parallelism in(More)
We present a state-of-the-art image recognition system, Deep Image, developed using end-to-end deep learning. The key components are a custom-built supercomputer dedicated to deep learning, a highly optimized parallel algorithm using new strategies for data partitioning and communication, larger deep neural network models, novel data augmentation(More)
Cloud computing is becoming a major trend for delivering and accessing infrastructure on demand via the network. Meanwhile, the usage of FPGAs (Field Programmable Gate Arrays) for computation acceleration has made significant inroads into multiple application domains due to their ability to achieve high throughput and predictable latency, while providing(More)
Approximate computing is a promising design paradigm for better performance and power efficiency. In this paper, we propose a power efficient framework for analog approximate computing with the emerging metal-oxide resistive switching random-access memory (RRAM) devices. A programmable RRAM-based approximate computing unit (RRAM-ACU) is introduced first to(More)
Domain of stereo vision is highly important in the fields of autonomous cars, video tolling, robotics, and aerial surveys. The specific feature of this domain is that we should handle not only the pixel-by-pixel 2D processing in one image but also the 3D processing for depth estimation by comparing information about a scene from several images with(More)
—This paper presents an FPGA based stereo vision system for future video tolling, which can achieve real-time processing for high resolution video streams. The key component for the system is SAD (Sum of Absolute Difference) based stereo matching. Although simple and effective, this method usually needs much computation power to satisfy real-time(More)
The research on complex Brain Networks plays a vital role in understanding the connectivity patterns of the human brain and disease-related alterations. Recent studies have suggested a noninvasive way to model and analyze human brain networks by using multi-modal imaging and graph theoretical approaches. Both the construction and analysis of the Brain(More)
In many types of tumours, especially pancreatic adenocarcinoma, miR-301a is over-expressed. This over-expression results in negative regulation of the target gene of miR-301a, the nuclear factor-κB (NF-κB) repressing factor (NKRF), increasing the activation of NF-κB and production of NF-κB-responsive pro-inflammatory cytokines such as interleukin-8,(More)
The cessation of Moore's Law has limited further improvements in power efficiency. In recent years, the physical realization of the memristor has demonstrated a promising solution to ultra-integrated hardware realization of neural networks, which can be leveraged for better performance and power efficiency gains. In this work, we introduce a power efficient(More)
Google's famous PageRank algorithm is widely used to determine the importance of web pages in search engines. Given the large number of web pages on the World Wide Web, efficient computation of PageRank becomes a challenging problem. We accelerated the power method for computing PageRank on AMD GPUs. The core component of the power method is the Sparse(More)