Trends in multicore DSP platforms

@article{Karam2009TrendsIM,
  title={Trends in multicore DSP platforms},
  author={Lina Karam and Ismail AlKamal and Alan Gatherer and Gene A. Frantz and D.V. Anderson and Brian L. Evans},
  journal={IEEE Signal Processing Magazine},
  year={2009},
  volume={26}
}
In the last two years, the embedded DSP market has been swept up by the general increase in interest in multicore that has been driven by companies such as Intel and Sun. One reason for this is that there is now a lot of focus on tooling in academia and also a willingness on the part of users to accept new programming paradigms. This industry-wide effort will have an effect on the way multicore DSPs are programmed and perhaps architected. But it is too early to say in what way this will occur…
THE VANTAGE OF UTILIZING FPGA IN THE DESIGN OF AN EMBEDDED MULTIPROCESSOR
There are recent needs to design an embedded multiprocessor that will overcome the limitations in the performance of a uniprocessor. The deputation to achieve real time deadlines and overcome area…
Supporting OpenMP on a multi-cluster embedded MPSoC
TLDR
This paper considers a representative template of a modern multi-cluster embedded MPSoC and presents an extensive evaluation of the cost associated with supporting OpenMP on such a machine, investigating several implementation variants that are aware of the memory hierarchy and of the heterogeneous interconnection.
Acoherent shared memory
TLDR
This dissertation explores Acoherent Shared Memory (ASM), a new model that facilitates the transfer of semantic information from software to hardware and shows that the ASM model can lead to efficient implementations by designing and evaluating ASM-CMP, an initial design for single-chip multiprocessors.
Sequential code parallelization for multi-core embedded systems: A survey of models, algorithms and tools
TLDR
This work surveys several aspects of parallelizing code, including models of code representation, code analysis, parallelism extraction algorithms, and parallel programming, and presents and compares existing parallelizing tools.
Evaluating OpenMP Support Costs on MPSoCs
TLDR
An extensive evaluation of the cost associated with supporting OpenMP on such a machine is presented, investigating several implementation variants that efficiently exploit the memory hierarchy; the results confirm the effectiveness of the optimizations in terms of performance improvements.
Characterizing emerging heterogeneous memory
TLDR
A set of parallel benchmarks to characterize the performance and power efficiency of HM and a profiling tool to provide guidance for placing data in HM are developed; this is the first benchmark suite with OpenMP 4.0 features that is functional on real HM architectures.
A memory-centric approach to enable timing-predictability within embedded many-core accelerators
TLDR
This paper studies how the predictable execution model (PREM), a memory-aware approach to enable timing-predictability in real-time systems, can be successfully adopted on multi- and many-core heterogeneous platforms.
Search-based temporal testing of multicore applications
TLDR
Overall, the findings suggest that various forms of search-based approaches are effective in generating test inputs exhibiting extreme execution times on the embedded multicore environment, and the use of search to discover high performing biased random sampling regimes has proved particularly effective.
A Toolflow for Parallelization of Embedded Software in Multicore DSP Platforms
TLDR
This paper presents a toolflow to guide developers in the process of programming multicore DSPs, and evaluates the applicability of the approach by parallelizing a set of realistic embedded benchmarks on a commercial multicore DSP platform from Texas Instruments.
Towards many core real-time embedded systems: software design of streaming systems at system level
TLDR
This thesis presents a model-of-computation based programming model which supports scalable specifications of a system in a parametrized manner and presents both analytic and simulation-based techniques to tackle the complex interference and correlations within multi/many-core embedded systems such that accurate estimation can be conducted.

References

Implementing OpenMP on a high performance embedded multicore MPSoC
TLDR
The initial experiences adapting OpenMP to enable it to serve as a programming model for high performance embedded systems and the needs of embedded application developers are discussed.
SChISM: Scalable Cache Incoherent Shared Memory
TLDR
This paper motivates and describes a class of accelerator architectures that manage cache coherence in software to exploit data sharing and communication characteristics present in emerging highly parallel workloads and demonstrates an implementation of the Rigel Task Model.
The Landscape of Parallel Computing Research: A View from Berkeley
TLDR
The parallel landscape is framed with seven questions, and the following are recommended to explore the design space rapidly: • The overarching goal should be to make it easy to write programs that execute efficiently on highly parallel computing systems • The target should be 1000s of cores per chip, as these chips are built from processing elements that are the most efficient in MIPS (Million Instructions per Second) per watt, MIPS per area of silicon, and MIPS per development dollar.
X10: an object-oriented approach to non-uniform cluster computing
TLDR
A modern object-oriented programming language, X10, is designed for high performance, high productivity programming of NUCC systems and an overview of the X10 programming model and language, experience with the reference implementation, and results from some initial productivity comparisons between the X10 and Java™ languages are presented.
The problem with threads
For concurrent programming to become mainstream, we must discard threads as a programming model. Nondeterminism should be judiciously and carefully introduced where needed, and it should be explicit…
A 167-Processor Computational Platform in 65 nm CMOS
A 167-processor computational platform consists of an array of simple programmable processors capable of per-processor dynamic supply voltage and clock frequency scaling, three algorithm-specific…
Advances in hardware design and implementation of signal processing systems [DSP Forum]
This IEEE Signal Processing Magazine (SPM) forum discusses advances, challenges, and future trends in hardware design and implementation of signal processing (SP) systems. The invited forum members…
THE SANDBRIDGE SANDBLASTER CONVERGENCE PLATFORM
As applications converge to multimedia systems, architectures must converge to support voice, data, and video applications. From a processor architecture perspective, support for signal processing…
Amdahl's Law in the Multicore Era
  • M. Hill
  • Computer Science
  • Computer
  • 2008
Augmenting Amdahl's law with a corollary for multicore hardware makes it relevant to future generations of chips with multiple processor cores. Obtaining optimal multicore performance will require…
The sandbridge SB3011 SDR platform
TLDR
The Sandbridge Sandblaster real-time software defined radio platform is described with results for a number of interesting communications and multimedia systems including UMTS, DVB-H, WiMAX, WiFi, and NTSC video decoding.