Zoltan Majo

Multiprocessors based on processors with multiple cores usually include a non-uniform memory architecture (NUMA); even current 2-processor systems with 8 cores exhibit non-uniform memory access times. As the cores of a processor share a common cache, the issues of memory management and process mapping must be revisited. We find that optimizing only for data...
Many recent multicore multiprocessors are based on a nonuniform memory architecture (NUMA). A mismatch between the data access patterns of programs and the mapping of data to memory incurs a high overhead, as remote accesses have higher latency and lower throughput than local accesses. This paper reports on a limit study that shows that many scientific...
Future exascale systems will be based on multi-core processors, but even today's multi-core processors can be asymmetric and exhibit limitations and bottlenecks that are different from those found on a symmetric multiprocessor. In this paper we investigate the performance of a cluster node based on the Intel Xeon E5345 quad-core processor and note that...
Many recent multiprocessor systems are realized with a non-uniform memory architecture (NUMA) and accesses to remote memory locations take more time than local memory accesses. Optimizing NUMA memory system performance is difficult and costly for three principal reasons: (1) today's programming languages/libraries have no explicit support for NUMA...
Many multicore multiprocessors have a non-uniform memory architecture (NUMA), and for good performance, data and computations must be partitioned so that (ideally) all threads execute on the processor that holds their data. However, many multithreaded applications show heavy use of shared data structures that are accessed by all threads of the application...
An important aspect of workload characterization is understanding memory system performance (i.e., understanding a workload's interaction with the memory system). On systems with a non-uniform memory architecture (NUMA) the performance critically depends on the distribution of data and computations. The actual memory access patterns have a large influence...
Experience has shown that thread-based parallel programming is prone to error. One particularly common kind of error is a data race. A data race occurs when multiple parallel, unordered threads access the same memory location, and at least one of those tasks is performing a write. Unfortunately, data races are not only common, they are also extremely...
Protecting running applications is a hard problem. Many applications are written in a low-level language and are prone to exploits. Bugs can be used to exploit the application and to run malicious code. A rigorous code review is often not possible due to the size and the complexity of the applications. Even a detailed code review does not guarantee that all...
With the arrival of multicore systems, parallel programming is becoming increasingly mainstream. Writing correct parallel programs, however, has turned out to be difficult and prone to errors without proper support from the employed programming languages, compilers, and runtime systems. Over the last years, researchers and engineers have developed numerous...