FastFlow: high-level and efficient streaming on multi-core∗ (A FastFlow short tutorial)


Computer hardware manufacturers have moved decisively to multi-core and are currently experimenting with increasingly advanced many-core architectures. In the long term, writing efficient, portable and correct parallel programs targeting multi- and many-core architectures must become no more challenging than writing the same programs for sequential computers. To date, however, most applications running on multi-core machines do not fully exploit the potential of these architectures. This situation is in part due to the scarcity of good high-level programming tools suitable for multi-/many-core architectures, and in part to the fact that multi-core programming is still viewed as an exotic branch of high-performance computing (HPC) rather than as the de facto standard programming practice for the masses. Some efforts have been made to provide programmers with tools suitable for mapping data parallel computations onto both multi-cores and GPUs, the most popular many-core architecture currently available. Tools have also been developed to support stream parallel computations [34, 31], since stream parallelism is a pattern characteristic of a large class of (potentially) parallel applications. Two major issues with these programming environments and tools relate to programmability and efficiency. Programmability is often impaired by the modest level of abstraction provided to the programmer. Efficiency more generally suffers from the difficulty of exploiting the memory hierarchy effectively. As a consequence, two distinct but synergistic needs exist: on the one hand, increasingly efficient mechanisms supporting correct concurrent access to shared memory data structures are needed; on the other hand, there is a need for higher-level programming environments that hide the difficulties of using shared memory objects correctly and efficiently by raising the level of abstraction provided to application programmers.
To address these needs we introduce and discuss FastFlow, a programming framework specifically targeting cache-coherent shared-memory multi-cores. FastFlow is implemented as a stack of C++ template libraries. The lowest layer of FastFlow provides very efficient lock-free (and memory-fence-free) base synchronization mechanisms. The middle layer provides distinctive communication mechanisms supporting both single-producer/multiple-consumer and multiple-producer/single-consumer communications. These
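To make the idea of a lock-free base mechanism concrete, the following is a minimal sketch of a bounded single-producer/single-consumer queue, the kind of building block from which multi-producer or multi-consumer channels can be composed. It is not FastFlow's API: the class name `SpscQueue` and its `push`/`pop` methods are hypothetical, and the sketch relies on standard C++11 acquire/release atomics rather than the memory-fence-free technique the abstract attributes to FastFlow.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; not part of the FastFlow library.
// Bounded lock-free queue for exactly one producer thread and one
// consumer thread. No mutex is used: each index is written by one side only.
template <typename T>
class SpscQueue {
public:
    // One slot is always left empty to tell "full" apart from "empty",
    // so the buffer has capacity + 1 slots.
    explicit SpscQueue(std::size_t capacity)
        : buf_(capacity + 1), head_(0), tail_(0) {
        assert(capacity > 0);
    }

    // Producer side. Returns false (without blocking) if the queue is full.
    bool push(const T& item) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % buf_.size();
        if (next == head_.load(std::memory_order_acquire)) {
            return false;  // full
        }
        buf_[tail] = item;
        // Release: the slot write above becomes visible before the new tail.
        tail_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer side. Returns false (without blocking) if the queue is empty.
    bool pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) {
            return false;  // empty
        }
        out = buf_[head];
        // Release: the slot read above completes before the slot is reused.
        head_.store((head + 1) % buf_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<T> buf_;
    std::atomic<std::size_t> head_;  // next slot to read; written by consumer
    std::atomic<std::size_t> tail_;  // next slot to write; written by producer
};
```

In use, one thread would call `push` in a loop (retrying or yielding when it returns `false`) while a second thread calls `pop`. Because each index has a single writer, no compare-and-swap or lock is required.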


Cite this paper

@inproceedings{Aldinucci2011FastFlowHA, title={FastFlow: high-level and efficient streaming on multi-core∗ (A FastFlow short tutorial)}, author={Marco Aldinucci and Marco Danelutto and Peter Kilpatrick and Massimo Torquati}, year={2011} }