• Publications
GraphPIM: Enabling Instruction-Level PIM Offloading in Graph Computing Frameworks
With the emergence of data science, graph computing has become increasingly important these days. Unfortunately, graph computing typically suffers from poor performance when mapped to modern…
  • 93
  • 13
OpenCL Performance Evaluation on Modern Multi Core CPUs
Utilizing heterogeneous platforms for computation has become a general trend, making the portability issue important. OpenCL (Open Computing Language) serves the purpose by enabling portable execution…
  • 26
  • 4
Performance Characterisation and Simulation of Intel's Integrated GPU Architecture
Integrated GPUs (iGPUs) are ubiquitous in today's client devices such as laptops and desktops. Examples include Intel's HD or Iris Graphics and AMD's APUs. An iGPU resides on the same chip as the…
  • 12
  • 2
CAIRO
Three-dimensional (3D)-stacking technology and the memory-wall problem have popularized processing-in-memory (PIM) concepts again, which offer the benefits of bandwidth and energy savings by…
  • 26
  • 2
CAIRO: A Compiler-Assisted Technique for Enabling Instruction-Level Offloading of Processing-In-Memory
Three-dimensional (3D)-stacking technology and the memory-wall problem have popularized processing-in-memory (PIM) concepts again, which offer the benefits of bandwidth and energy savings by…
  • 15
  • 1
Understanding Energy Aspects of Processing-near-Memory for HPC Workloads
Interest in the concept of processing-near-memory (PNM) has been reignited by recent improvements in 3D integration technology. In this work, we analyze the energy consumption characteristics…
  • 9
  • 1
A dual-rail voltage supply for battery powered portable devices
A positive and negative dual-rail voltage supply is proposed for battery-powered portable devices, and a test chip is designed and fabricated using a 0.13 µm CMOS triple-well process. Proposed dual voltage…
  • 4
  • 1
Batch-Aware Unified Memory Management in GPUs for Irregular Workloads
While unified virtual memory and demand paging in modern GPUs provide convenient abstractions to programmers for working with large-scale applications, they come at a significant performance cost. We…
  • 4
  • 1
Traversing large graphs on GPUs with unified memory
Due to the limited capacity of GPU memory, the majority of prior work on graph applications on GPUs has been restricted to graphs of modest sizes that fit in memory. Recent hardware and software…
  • 1
  • 1
Separate Extraction of Source, Drain, and Substrate Resistances in MOSFETs With Parasitic Junction Current Method
The separate extraction of asymmetric source (R_S) and drain (R_D) resistances caused by the variations in the layout, process, and device degradation is important…
  • 11