Ram Huggahalli

Recent I/O technologies such as PCI-Express and 10Gb Ethernet enable unprecedented levels of I/O bandwidths in mainstream platforms. However, in traditional architectures, memory latency alone can limit processors from matching 10 Gb inbound network I/O traffic. We propose a platform-wide method called Direct Cache Access (DCA) to deliver inbound I/O data…
10GbE connectivity is expected to be a standard feature of server platforms in the near future. Among the numerous methods and features proposed to improve network performance of such platforms is Direct Cache Access (DCA) to route incoming I/O to CPU caches directly. While this feature has been shown to be promising, there can be significant challenges…
Adoption of the 10GbE standard as a high-performance interconnect has been impeded by two important performance-oriented considerations: (1) processing requirements of common protocol stacks and (2) end-to-end latency. The overheads of typical software-based protocol stacks on CPU utilization and throughput have been well evaluated in several…
The Intel® Omni-Path Architecture (Intel® OPA) is designed to enable a broad class of computations requiring scalable, tightly coupled CPU, memory, and storage resources. Integration between devices in the Intel® OPA family and Intel® CPUs enables improvements in system-level packaging and network efficiency. When coupled with the new user-focused open…
As platforms evolve from employing single-threaded, single-core CPUs to multi-threaded, multi-core CPUs and embedded hardware-assist engines, the simulation infrastructure required for performance analysis of these platforms becomes extremely complex. While investigating hardware/software solutions for server network acceleration (SNA), we encountered…
With the rapid evolution of network speed from 1Gbps to 10Gbps, a wide spectrum of research has been done on TCP/IP to improve its processing efficiency on general-purpose processors. However, most of this work studied TCP/IP only from the performance perspective and ignored its power efficiency. As power has become a major concern in data centers, where servers…
Scaling TCP/IP receive side processing to 10Gbps speeds on commercial server platforms has been a major challenge. This led to the development of two key techniques: Large Receive Offload (LRO) and Direct Cache Access (DCA). Only recently, systems supporting these two techniques have become available. So, we want to evaluate these two techniques using 10 Gigabit…