Riposte: A trace-driven compiler and parallel VM for vector code in R

@inproceedings{Talbot2012RiposteAT,
  title={Riposte: A trace-driven compiler and parallel VM for vector code in R},
  author={Justin Talbot and Zach DeVito and Pat Hanrahan},
  booktitle={2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT)},
  year={2012},
  pages={43--51}
}
  • Published 19 September 2012
There is a growing utilization gap between modern hardware and modern programming languages for data analysis. Due to power and other constraints, recent processor design has sought improved performance through increased SIMD and multi-core parallelism. At the same time, high-level, dynamically typed languages for data analysis have become popular. These languages emphasize ease of use and high productivity, but have, in general, low performance and limited support for exploiting hardware…
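The core idea named in the title, deferring vector operations, recording them as a trace, and fusing the trace into a single pass over the data, can be sketched as follows. This is an illustrative toy model in Python, not Riposte's actual implementation; the class and method names are invented for the sketch.

```python
# Toy model of trace-driven deferred vector evaluation: arithmetic on
# Vec objects builds a trace of deferred elementwise ops instead of
# executing eagerly; force() fuses the whole trace into one loop.
class Vec:
    def __init__(self, data=None, op=None, args=None):
        self.data = data      # concrete values, if materialized
        self.op = op          # deferred elementwise operation
        self.args = args      # operand Vec nodes in the trace

    def __add__(self, other):
        return Vec(op=lambda a, b: a + b, args=(self, other))

    def __mul__(self, other):
        return Vec(op=lambda a, b: a * b, args=(self, other))

    def _length(self):
        node = self
        while node.data is None:
            node = node.args[0]
        return len(node.data)

    def force(self):
        """Fuse the recorded trace into a single loop over the elements."""
        if self.data is not None:
            return self.data
        def eval_at(node, i):
            if node.data is not None:
                return node.data[i]
            return node.op(*(eval_at(a, i) for a in node.args))
        self.data = [eval_at(self, i) for i in range(self._length())]
        return self.data

x = Vec([1, 2, 3])
y = Vec([10, 20, 30])
z = (x + y) * x        # no work yet: this only records a trace
print(z.force())       # fused evaluation in one pass: [11, 44, 99]
```

Fusing the trace avoids materializing the intermediate vector `x + y`, which is the main memory-traffic saving such schemes aim for; Riposte additionally JIT-compiles fused traces to parallel SIMD code.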
Citations

Accelerating Dynamically-Typed Languages on Heterogeneous Platforms Using Guards Optimization
This paper presents MegaGuards, a new approach for speculatively executing dynamic languages on heterogeneous platforms in a fully automatic and transparent manner; it removes guards from compute-intensive loops and improves sequential performance.
Just-In-Time GPU Compilation for Interpreted Languages with Partial Evaluation
This paper uses just-in-time compilation to transparently and automatically offload computations from interpreted dynamic languages to heterogeneous devices, and shows that, once start-up time is taken into account, large speedups are achievable even when applications run for as little as a few seconds.
Parallelizing Julia with a Non-Invasive DSL
ParallelAccelerator is presented, a library and compiler for high-level, high-performance scientific computing in Julia that exposes the implicit parallelism in high-level array-style programs and compiles them to fast, parallel native code.
Optimizing R VM: Allocation Removal and Path Length Reduction via Interpreter-level Specialization
This paper introduces a first classification of R programming styles into Type I (looping over data), Type II (vector programming), and Type III (glue code), showing that R's most serious overheads manifest mostly in Type I code, whereas much Type III code can be quite fast.
Just-in-time Length Specialization of Dynamic Vector Code
A trace-based just-in-time compilation strategy is presented that performs partial length specialization of dynamically typed vector code, avoiding excessive compilation overhead while still enabling the generation of efficient machine code through length-based optimizations.
Dynamic page sharing optimization for the R language
This work presents a low-overhead page sharing approach for R that significantly reduces the interpreter's memory overhead; concentrating on the most rewarding optimizations avoids the high runtime cost of existing generic approaches to memory deduplication or compression.
ROSA: R Optimizations with Static Analysis
ROSA is presented, a static analysis framework that improves the performance and space efficiency of R programs, showing substantial reductions in execution time and memory consumption over both CRAN R and Microsoft R Open.
Contextual dispatch for function specialization
This paper proposes an approach to further the specialization of dynamic language compilers by disentangling classes of behaviors into separate optimization units, and describes a compiler for the R language that uses this approach.
Optimizing R language execution via aggressive speculation
Novel optimizations, backed by aggressive speculation techniques, are described and implemented within FastR, an alternative R language implementation built on Truffle, a JVM-based language development framework developed at Oracle Labs.

References

Showing 1-10 of 42 references
Intel's Array Building Blocks: A retargetable, dynamic compiler and embedded language
This paper introduces Intel® Array Building Blocks (ArBB), a retargetable dynamic compilation framework that focuses on making it easier to write and port programs so that they can harvest data and thread parallelism on both multi-core and heterogeneous many-core architectures while staying within standard C++.
ispc: A SPMD compiler for high-performance CPU programming
  • M. Pharr, W. Mark
  • 2012 Innovative Parallel Computing (InPar)
A compiler, the Intel® SPMD Program Compiler (ispc), is developed that delivers very high performance on CPUs thanks to effective use of both multiple processor cores and SIMD vector units.
Compiling for stream processing
This paper presents a compiler for stream programs that efficiently schedules computational kernels and stream memory operations and allocates on-chip storage; it overlaps memory operations and manages local storage so that 78% to 96% of program execution time is spent running computational kernels.
Copperhead: compiling an embedded data parallel language
The language, compiler, and runtime features that enable Copperhead to efficiently execute data parallel code are discussed, and the program analysis techniques necessary for compiling Copperhead code into efficient low-level implementations are introduced.
HotpathVM: an effective JIT compiler for resource-constrained devices
This paper presents a just-in-time compiler for a Java VM that is small enough to fit on resource-constrained devices, yet is surprisingly effective; benchmarks show speedups that in some cases rival heavy-weight just-in-time compilers.
Harnessing the Multicores: Nested Data Parallelism in Haskell
Data Parallel Haskell is described, which embodies nested data parallelism in a modern, general-purpose language and is implemented in the state-of-the-art compiler GHC; the paper focuses particularly on the vectorisation transformation, which converts nested data parallelism into flat data parallelism.
Dynamo: a transparent dynamic optimization system
We describe the design and implementation of Dynamo, a software dynamic optimization system that is capable of transparently improving the performance of a native instruction stream as it executes on…
Lazy binary-splitting: a run-time adaptive work-stealing scheduler
Lazy Binary Splitting is presented, a user-level scheduler of nested parallelism for shared-memory multiprocessors that builds on existing eager binary splitting work-stealing but improves performance and ease of programming.
Scalable aggregation on multicore processors
This paper provides a solution for performing in-memory parallel aggregation on the Intel Nehalem architecture, considering several previously proposed techniques, including a hybrid independent/shared method and a method that clones data items automatically when contention is detected.