Ericles Rodrigues Sousa

Learn More
Nowadays, computer vision algorithms have countless application domains. On the one hand, these algorithms are typically computationally demanding, on the other hand, they are often used in embedded systems, which have stringent constraints on, e. g., size or power. In this work, we present the benefits of mapping compute-intensive imaging algorithms on(More)
Increasing power densities have led to the dark silicon era, for which heterogeneous multicores with different power and performance characteristics are promising architectures. This paper focuses on maximizing the overall system performance under a critical temperature constraint for heterogeneous tiledmulticores, where all cores or accelerators inside a(More)
This paper presents an integrated and coordinated cross-layer sensing and optimization flow for distributed dark silicon management for tiled heterogeneous manycores under a critical temperature constraint. We target some of the key challenges in dark silicon for manycores, such as: directly focusing on power density/temperature instead of considering(More)
Multiprocessor system-on-chip (MPSoC) designs offer a lot of computational power assembled in a compact design. The computing power of MPSoCs can be further augmented by adding massively parallel processor arrays (MPPA) and specialized hardware with instruction-set extensions. On-chip MPPAs can be used to accelerate low-level image-processing algorithms(More)
Optical flow is widely used in many applications of portable mobile devices and automotive embedded systems for the determination of motion of objects in a visual scene. Also in robotics, it is used for motion detection, object segmentation, time-to-contact information, focus of expansion calculations, robot navigation, and automatic parking for vehicles.(More)
This paper describes a runtime reconfigurable bus arbitration technique for concurrent applications on heterogeneous MPSoC architectures. Here, a hardware/software approach is introduced as part of a runtime framework that enables selecting and adapting different policies (i. e., fixed-priority, TDMA, and Round-Robin) such that the performance goals of(More)
Continuous software and hardware innovations impose on the one hand a high degree of flexibility from an algorithm and on the other hand it requires that a given processing architecture has the capability to adapt to changing computation patterns at run-time. In this work, we demonstrate how a computer vision application can adapt itself at runtime in order(More)
Recently, DSP and FPGA devices have been employed in cooperative computing architectures for embedded systems which has required high complexity computing process which in turn demanded efficient techniques capable of measuring the total effective gain of the partitioning of a given code. In order to meet that need and based on Amdahl's Law a mathematical(More)
Continuous technology miniaturization allows to build massively parallel embedded computer architectures within a single silicon chip. Programming that leverages the abundant parallelism in such architectures, however, is very difficult, tedious, and error-prone. Thus, compiler support is paramount. We therefore present LoopInvader, a loop compiler for a(More)
Massively Parallel Processor Arrays (MPPAs) can be nicely used in portable devices such as tablets and smartphones. However, applications running on mobile platforms require a certain performance level or quality (e.g., high-resolution image processing) that need to be satisfied while adhering to a certain power budget and temperature threshold. As a(More)