Performance evaluation of OpenMP's target construct on GPUs - exploring compiler optimisations

@article{Hayashi2019PerformanceEO,
  title={Performance evaluation of OpenMP's target construct on GPUs - exploring compiler optimisations},
  author={Akihiro Hayashi and J. Shirako and Ettore Tiotto and R. Ho and Vivek Sarkar},
  journal={Int. J. High Perform. Comput. Netw.},
  year={2019},
  volume={13},
  pages={54--69}
}
OpenMP is a directive-based shared memory parallel programming model and has been widely used for many years. From OpenMP 4.0 onwards, GPU platforms are supported by extending OpenMP's high-level parallel abstractions with accelerator programming. This extension allows programmers to write GPU programs in standard C/C++ or Fortran languages, without exposing too many details of GPU architectures. However, such high-level programming models generally impose additional program optimisations on…
5 Citations
A Case Study of Porting HPGMG from CUDA to OpenMP Target Offload
Memory Efficient High-Performance Rotational Image Encryption
  • Raviraja Holla M, S. D
  • 2019 International Conference on Communication and Electronics Systems (ICCES)
  • 2019
