Marcelo H. Cintra

With speculative parallelization, code sections that cannot be fully analyzed by the compiler are aggressively executed in parallel. Hardware schemes are fast but expensive and require modifications to the processors and memory system. Software schemes require no extra hardware but can be inefficient. This paper proposes a new software-only speculative parallelization scheme.
Speculative parallelization aggressively executes in parallel codes that cannot be fully parallelized by the compiler. Past proposals of hardware schemes have mostly focused on single-chip multiprocessors (CMPs), whose effectiveness is necessarily limited by their small size. Very few schemes have attempted this technique in the context of scalable shared-memory systems.
With speculative parallelization, code sections that cannot be fully analyzed by the compiler are optimistically executed in parallel. Hardware schemes are fast but expensive and require modifications to the processors and/or memory system. Software schemes require no changes to the hardware of existing shared-memory systems, but can suffer from significant overheads.
With speculative thread-level parallelization, codes that cannot be fully compiler-analyzed are aggressively executed in parallel. If the hardware detects a cross-thread dependence violation, it squashes offending threads and resumes execution. Unfortunately, frequent squashing cripples performance. This paper proposes a new framework of hardware mechanisms to eliminate most of these squashes.
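To make the squash-and-retry idea concrete, the sketch below is a minimal software analogue of it, not the hardware framework these papers propose: each iteration runs speculatively against a private write buffer while logging its reads, and an in-order commit phase checks each read against locations already written by earlier iterations; a detected cross-iteration violation squashes the iteration, which is then re-executed with up-to-date data. All names (SpecTask, speculative_body) are illustrative.

```cpp
// Minimal software sketch of speculative loop parallelization with
// squash-and-re-execute on a detected cross-iteration dependence violation.
#include <cstdio>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

struct SpecTask {
    std::vector<std::pair<int, double>> writes; // buffered (index, value) pairs
    std::vector<int> reads;                     // logged read indices
};

// Loop body the compiler could not analyze: data[dst[i]] = data[src[i]] + 1.0,
// where src and dst may alias across iterations.
void speculative_body(int i, const std::vector<int>& src, const std::vector<int>& dst,
                      const std::vector<double>& data, SpecTask& t) {
    t.reads.push_back(src[i]);
    double v = data[src[i]] + 1.0;   // read possibly stale shared state
    t.writes.push_back({dst[i], v}); // buffer the write instead of updating shared state
}

int main() {
    const int N = 8;
    std::vector<int> src = {0, 1, 2, 3, 0, 1, 2, 3};
    std::vector<int> dst = {1, 2, 3, 4, 5, 6, 7, 0};
    std::vector<double> data(N, 0.0);
    std::vector<bool> committed_write(N, false); // locations written by committed iterations

    // Speculative phase: iterations run in parallel (one thread per iteration only
    // for illustration; real schemes assign chunks of iterations to threads).
    std::vector<SpecTask> tasks(N);
    std::vector<std::thread> pool;
    for (int i = 0; i < N; ++i)
        pool.emplace_back(speculative_body, i, std::cref(src), std::cref(dst),
                          std::cref(data), std::ref(tasks[i]));
    for (auto& th : pool) th.join();

    // Commit phase, in original iteration order.
    for (int i = 0; i < N; ++i) {
        bool violated = false;
        for (int r : tasks[i].reads)
            violated = violated || committed_write[r]; // an earlier iteration wrote what we read
        if (violated) {                                // squash: discard buffers, redo with fresh data
            tasks[i].reads.clear();
            tasks[i].writes.clear();
            speculative_body(i, src, dst, data, tasks[i]);
        }
        for (const auto& w : tasks[i].writes) {        // apply buffered writes to shared state
            data[w.first] = w.second;
            committed_write[w.first] = true;
        }
    }
    for (int i = 0; i < N; ++i) std::printf("data[%d] = %g\n", i, data[i]);
    return 0;
}
```

With the sample src/dst indices, iterations 1-3 and 5-7 read locations written by earlier iterations, are squashed at commit time, and are re-executed, so the final contents of data match a purely sequential run.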
Speculative parallelization is a technique that complements automatic compiler parallelization by allowing code sections that cannot be fully analyzed by the compiler to be aggressively executed in parallel. However, while speculative parallelization can potentially deliver significant speedups, several overheads associated with the technique limit these speedups.
Finding the fastest algorithm to solve a problem is one of the main issues in Computational Geometry. Focusing only on worst-case analysis or asymptotic computations leads to the development of complex data structures or hard-to-implement algorithms. Randomized algorithms appear in this scenario as a very useful tool for obtaining easier-to-implement algorithms.
While hardware-coherent scalable shared-memory multiprocessors are relatively easy to program, they still require substantial programming effort to deliver high performance. Specifically, to minimize remote accesses, data must be carefully laid out in memory for locality and application working sets carefully tuned for caches. It has been claimed that this …
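As a side note on the locality tuning this abstract refers to, the fragment below (an illustration of the general point, not anything from the paper) contrasts a unit-stride traversal of a row-major matrix with a stride-n traversal of the same data; on most machines the second version runs several times slower purely because of how the data is laid out relative to cache lines.

```cpp
// Compare cache-friendly (row order) and cache-hostile (column order)
// traversals of the same row-major matrix.
#include <chrono>
#include <cstdio>
#include <vector>

int main() {
    const int n = 4096;
    std::vector<double> a(static_cast<size_t>(n) * n, 1.0); // row-major n x n matrix

    auto time_sum = [&](bool row_major) {
        auto t0 = std::chrono::steady_clock::now();
        double sum = 0.0;
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                sum += row_major ? a[static_cast<size_t>(i) * n + j]  // unit stride
                                 : a[static_cast<size_t>(j) * n + i]; // stride of n doubles
        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                      std::chrono::steady_clock::now() - t0).count();
        std::printf("%-12s traversal: sum=%.0f, %lld ms\n",
                    row_major ? "row-major" : "column-major", sum, (long long)ms);
    };
    time_sum(true);   // touches consecutive memory; roughly one cache miss per line
    time_sum(false);  // jumps n*8 bytes per access; roughly one miss per access
}
```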
The developments in the field of computer architecture, especially parallel systems, have led to the design of ever more complex architectures, making it difficult to make decisions that would increase the performance of the system. In order to objectively analyze the advantages of different architectural choices, it is important to have modeling and analysis tools.
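As a trivial example of the kind of analytical modeling such tools build on, the snippet below evaluates Amdahl's law, a standard formula (not this paper's model): with parallel fraction p and n processors, the predicted speedup is 1 / ((1 - p) + p / n).

```cpp
// Amdahl's law: predicted speedup for a program whose fraction p parallelizes perfectly.
#include <cstdio>

double amdahl_speedup(double p, int n) {
    return 1.0 / ((1.0 - p) + p / n);
}

int main() {
    const double p = 0.9; // assume 90% of the execution time is parallelizable
    for (int n : {2, 4, 8, 16, 64})
        std::printf("n=%2d processors -> predicted speedup %.2fx\n", n, amdahl_speedup(p, n));
    // Diminishing returns are immediate: the serial 10% caps the speedup below 10x.
    return 0;
}
```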