Learn More
Modern compilers are responsible for adapting the semantics of source programs into a form that makes efficient use of a highly complex, heterogeneous machine. This adaptation amounts to solve an optimization problem in a huge and unstructured search space, while predicting the performance outcome of complex sequences of program transformations. The(More)
1 2 Harm Munk et al. Streaming applications are built of data-driven, computational components, consuming and producing unbounded data streams. Streaming oriented systems have become dominant in a wide range of domains, including embedded applications and DSPs. However, programming efficiently for streaming architectures is a challenging task, having to(More)
The most popular I/O virtualization method today is paravirtual I/O. Its popularity stems from its reasonable performance levels while allowing the host to interpose, i.e., inspect or control, the guest's I/O activity. We show that paravirtual I/O performance still significantly lags behind that of state-of-the-art non-interposing I/O virtualization, SRIOV.(More)
The performance of many parallel applications relies not on instruction-level parallelism but on loop-level parallelism. Unfortunately, automatic parallelization of loops is a fragile process, many different obstacles affect or prevent it in practice. To address this predicament we developed an interactive compilation feedback system that guides programmers(More)
Streaming applications are based on a data-driven approach where compute components consume and produce unbounded data vectors. Streaming oriented systems have become dominant in a wide range of domains, including embedded applications and DSPs However, programming efficiently for streaming architectures is a very challenging task, having to carefully(More)
The traditional "trap and emulate" I/O paravirtualization model conveniently allows for I/O interposition, yet it inherently incurs costly guest-host context switches. The newer "sidecore" model eliminates this overhead by dedicating host (side)cores to poll the relevant guest memory regions and react accordingly without context switching. But the(More)
Helping programmers write parallel software is an urgent problem given the popularity of multi-core architectures. Engineering compilers which automatically parallelize and vectorize code has turned out to be very challenging. Compilers are not powerful enough to exploit all opportunities for optimization in a code fragment – rather, they are selective with(More)
Hypervisors implement useful features such as live migration and software-defined networking by interpos-ing on their guest virtual machines' I/O activity. Unfortunately , this interposition significantly reduces performance and scalability due to competition for resources between multiple guests and costly host/guest context switches. We present an(More)
Para-virtualization is the leading approach in IO device virtualization. It allows the hypervisor to interpose on and inspect a virtual machine's I/O traffic at run-time. Examples of such interfaces are KVM's virtio [6] and VMWare's VMXNET [7]. Current implementations of virtual I/O in the hypervisor have been shown to have performance and scalability(More)
  • 1