Exploitation of APL data parallelism on a shared-memory MIMD machine

@inproceedings{Ju1991ExploitationOA,
  title={Exploitation of APL data parallelism on a shared-memory MIMD machine},
  author={Roy Dz-Ching Ju and Wai-Mee Ching},
  booktitle={PPOPP '91},
  year={1991}
}
Programs written in APL implicitly contain data parallelism because the high-level APL primitives denoting array operations may be executed in parallel. Our experimental APL/C compiler translates ordinary APL programs into the C language with additional parallel constructs for synchronization support. We target the RP3, a shared-memory MIMD machine built at IBM T.J. Watson Research Center, running the Mach operating system. The compiler uses Mach kernel primitives to build a parallel run-time…
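As an illustration of the translation scheme the abstract describes, here is a minimal, hypothetical sketch of what generated C for an element-wise APL primitive such as C ← A + B might look like. The actual compiler built its run-time on Mach kernel primitives; POSIX threads are substituted here so the sketch is self-contained, and every identifier is invented for illustration.

```c
/* Hypothetical shape of generated C for the APL expression  C <- A + B.
 * The real APL/C run-time used Mach kernel primitives; POSIX threads
 * stand in here so the sketch compiles and runs on its own. */
#include <pthread.h>
#include <stdlib.h>

#define NWORKERS 4               /* degree of parallelism (assumed)  */

static double *A, *B, *C;        /* flattened APL arrays             */
static long n;                   /* element count                    */
static pthread_barrier_t bar;    /* sync point between primitives    */

static void *worker(void *arg)
{
    long id = (long)arg;
    long lo = id * n / NWORKERS;         /* contiguous slice per worker */
    long hi = (id + 1) * n / NWORKERS;

    for (long i = lo; i < hi; i++)       /* the data-parallel primitive */
        C[i] = A[i] + B[i];

    pthread_barrier_wait(&bar);          /* all slices finish before the
                                            next APL primitive starts   */
    return NULL;
}

int main(void)
{
    n = 1L << 20;
    A = malloc(n * sizeof *A);
    B = malloc(n * sizeof *B);
    C = malloc(n * sizeof *C);
    for (long i = 0; i < n; i++) { A[i] = 1.0; B[i] = 2.0; }

    pthread_barrier_init(&bar, NULL, NWORKERS);
    pthread_t tid[NWORKERS];
    for (long id = 0; id < NWORKERS; id++)
        pthread_create(&tid[id], NULL, worker, (void *)id);
    for (long id = 0; id < NWORKERS; id++)
        pthread_join(tid[id], NULL);
    return 0;
}
```

Each worker owns a contiguous slice of the flattened arrays; the barrier plays the role of the synchronization constructs the compiler would insert between consecutive primitives.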

Citations

Execution of automatically parallelized APL programs on RP3
TLDR
An experimental APL/C compiler is implemented, which accepts ordinary APL programs and produces C programs, together with a run-time environment that supports the parallel execution of these C programs on the RP3 computer, a shared-memory, 64-way MIMD machine built at the IBM Thomas J. Watson Research Center.
Compiling nested data-parallel programs for shared-memory multiprocessors
TLDR
The design, implementation, and evaluation are presented of an optimizing compiler for Vcode, an applicative nested data-parallel language, targeted at the Encore Multimax, a shared-memory multiprocessor.
On performance and space usage improvements for parallelized compiled APL code
TLDR
An analysis method is proposed by which the compiler can track how parallelism changes when high-level primitives are combined; this is necessary when the compiler must weigh a trade-off between more parallelism and further combination.
APEX: The APL Parallel Executor
TLDR
Extensions to APL, including rank, cut, and a monadic operand for dyadic reduction, improve compiled and interpreted code performance.
Size and access inference for data-parallel programs
Data-parallel programming languages have many desirable features, such as single-thread semantics and the ability to express fine-grained parallelism. However, it is challenging to implement such…
Implementing Data-Parallel Software on Dataflow Hardware
TLDR
A compiler and run-time system is written for a small data-parallel language targeted at EM-4, a hybrid dataflow/von Neumann computer that provides an interesting alternative MIMD parallel architecture because of its unusually good hardware support for communication and synchronization.
The key to a data parallel compiler
TLDR
It is demonstrated how an operator called the Key operator, which applies a function over groups of array cells ordered by their keys, when used in conjunction with a Node Coordinate Matrix, permits arbitrary computation over sub-trees of an AST using purely data-parallel array programming techniques.
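As a rough, hypothetical illustration of a key-style operation (not the paper's implementation, which pairs the operator with a Node Coordinate Matrix over ASTs): summing array cells grouped by their keys can be written as a single pass. The function name and the small-integer key assumption are invented for this sketch.

```c
/* Illustrative sketch of a key-grouped reduction, akin to applying
 * +/ under APL's key operator. Keys are assumed to be small ints. */
#include <stdio.h>

static void key_sum(const int *keys, const double *vals, int n,
                    double *out, int nkeys)
{
    for (int k = 0; k < nkeys; k++) out[k] = 0.0;
    for (int i = 0; i < n; i++)        /* group cells by key, reduce */
        out[keys[i]] += vals[i];
}

int main(void)
{
    int    keys[] = {0, 1, 0, 2, 1};
    double vals[] = {1.0, 2.0, 3.0, 4.0, 5.0};
    double out[3];
    key_sum(keys, vals, 5, out, 3);
    for (int k = 0; k < 3; k++)
        printf("key %d: %g\n", k, out[k]);   /* 4, 7, 4 */
    return 0;
}
```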
kappa-Project - First Step: To Improve Data Manipulations and Representations on Parallel Computers
TLDR
This work developed a SIMD line-processor computer and a high-level programming environment, including a C-like Array Programming Language with object-oriented abilities, in order to solve physical problems.
Accelerating information experts through compiler design
TLDR
The Co-dfns compiler project aims to reduce the overheads involved in creating high-performance code in APL; it integrates with the APL environment and compiles a familiar subset of the language, delivering significant performance and platform independence to information experts without requiring code rewrites or conversion into other languages.

References

Automatic parallelization of APL-style programs
TLDR
The APL/370 compiler the authors have been developing aims at implementing automatic parallelization of APL programs at the basic-block level; it exploits functional parallelism on data-independent sub-expressions and data parallelism of array primitives on array elements.
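To make the two kinds of parallelism named here concrete (a hypothetical sketch, not the APL/370 compiler's output): in an APL line such as D ← (+/A) × (⌈/B), the two reductions are data-independent sub-expressions that can be forked as separate tasks, while each reduction is itself a data-parallel primitive over array elements. A minimal C rendering with POSIX threads:

```c
/* Hypothetical sketch: functional parallelism on the data-independent
 * sub-expressions of  D <- (+/A) x (⌈/B).  Each reduction runs as its
 * own task; inside each task the work is data parallel. */
#include <pthread.h>
#include <stdio.h>

static double A[1000], B[1000];

static void *sum_A(void *out)            /* +/A */
{
    double s = 0.0;
    for (int i = 0; i < 1000; i++) s += A[i];
    *(double *)out = s;
    return NULL;
}

static void *max_B(void *out)            /* ⌈/B */
{
    double m = B[0];
    for (int i = 1; i < 1000; i++) if (B[i] > m) m = B[i];
    *(double *)out = m;
    return NULL;
}

int main(void)
{
    for (int i = 0; i < 1000; i++) { A[i] = 1.0; B[i] = i; }

    double s, m;
    pthread_t t1, t2;
    pthread_create(&t1, NULL, sum_A, &s);   /* independent tasks forked */
    pthread_create(&t2, NULL, max_B, &m);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);                 /* join before combining    */

    printf("D = %g\n", s * m);              /* 1000 * 999 = 999000      */
    return 0;
}
```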
Program Analysis and Code Generation in an APL/370 Compiler
TLDR
An APL/370 compiler is implemented which accepts a subset of APL that includes most language features and a majority of APL primitive functions; it removes the performance penalty of APL in computation-intensive applications.
A VLIW architecture for a trace scheduling compiler
TLDR
Multiflow Computer, Inc., has now built a VLIW called the TRACE™ along with its companion Trace Scheduling™ compacting compiler, and this new machine has fulfilled the performance promises that were made.
POSC—a partitioning and optimizing SISAL compiler
TLDR
The POSC compiler system described in this paper addresses two major issues in converting the potential parallelism in SISAL programs into real speedup on multiprocessor systems, by integrating previous work on efficient sequential implementation of SISAL programs with previous work on selecting the useful parallelism.
Parallel Supercomputing Today and the Cedar Approach
TLDR
A wide range of algorithms and applications is being developed in an effort to provide high parallel processing performance in many fields, and new parallel supercomputer architectures are emerging that may provide rapid growth in performance.
Occamflow: A Methodology for Programming Multiprocessor Systems
Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers
TLDR
For certain types of loops, it is shown analytically that guided self-scheduling uses minimal overhead and achieves optimal schedules, and experimental results are discussed that clearly show the advantage of guided self-scheduling over the most widely known dynamic methods.
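Guided self-scheduling dispatches to each idle processor a chunk of ⌈R/P⌉ iterations, where R is the number of iterations remaining and P the processor count, so chunks shrink as the loop drains. A minimal sketch under that definition, with invented names and a mutex standing in for whatever synchronization a real run-time would use:

```c
/* Minimal sketch of guided self-scheduling (GSS): each idle worker
 * grabs ceil(remaining / P) iterations, so chunk sizes shrink as the
 * loop drains, balancing load while keeping dispatches infrequent. */
#include <pthread.h>

#define P 4                        /* number of processors (assumed) */

static long next_iter  = 0;        /* first undispatched iteration   */
static long trip_count = 1000;     /* total loop iterations          */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Fills [lo, hi) with the next chunk; returns 0 once the loop is done. */
static int gss_grab(long *lo, long *hi)
{
    pthread_mutex_lock(&lock);
    long remaining = trip_count - next_iter;
    if (remaining <= 0) {
        pthread_mutex_unlock(&lock);
        return 0;
    }
    long size = (remaining + P - 1) / P;   /* ceil(remaining / P) */
    *lo = next_iter;
    next_iter += size;
    *hi = next_iter;
    pthread_mutex_unlock(&lock);
    return 1;
}

/* Worker skeleton: repeatedly grab a chunk, run its iterations. */
static void *worker(void *arg)
{
    (void)arg;
    long lo, hi, done = 0;
    while (gss_grab(&lo, &hi))
        for (long i = lo; i < hi; i++)
            done++;                        /* stand-in for the loop body */
    return NULL;
}

int main(void)
{
    pthread_t t[P];
    for (long i = 0; i < P; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (long i = 0; i < P; i++) pthread_join(t[i], NULL);
    return 0;
}
```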
A Compiler for the MIT Tagged-Token Dataflow Architecture
TLDR
Compilation of the programming language Id Nouveau into machine code for the MIT tagged-token dataflow architecture is thoroughly described and several common optimizing transformations are discussed.