Theodore S. Papatheodorou

The bus that connects processors to memory is known to be a major architectural bottleneck in SMPs. However, both software and scheduling policies for these systems generally focus on memory hierarchy optimizations and do not address the bus bandwidth limitations directly. In this paper, we first present experimental results which indicate that bus …
In this paper we reformulate the thread scheduling problem on multiprogrammed SMPs. Scheduling algorithms usually attempt to maximize the performance of memory-intensive applications by optimally exploiting the cache hierarchy. We present experimental results indicating that, contrary to common belief, the extent of performance loss of memory-intensive, …
In this paper we propose and study a framework for evaluating Hypermedia Application Development and Management Systems (HADMS) in relation to specific application requirements. We address the need for HADMS capable of efficiently supporting the main users involved in the life cycle of hypermedia applications, namely designers, programmers/implementers, …
Multiprocessor systems are increasingly becoming the systems of choice for low- and high-end servers, running such diverse tasks as number crunching, large-scale simulations, database engines, and World Wide Web server applications. With such diverse workloads, system utilization and throughput, as well as execution time, become important performance metrics. …
An algorithm for approximating certain classes of elliptic partial differential equations on a rectangle is presented. The algorithm uses high-order 9-point difference approximations to the Helmholtz-type (fourth-order) or Poisson (sixth-order) equations and the fast Fourier transform. Compared to efficient second-order fast direct methods for smooth …
This paper investigates the performance implications of data placement in OpenMP programs running on modern ccNUMA multiprocessors. Data locality and minimization of the rate of remote memory accesses are critical for sustaining high performance on these systems. We show that due to the low remote-to-local memory access latency ratio of state-of-the-art …
This paper investigates the performance of synchronization algorithms on ccNUMA multiprocessors, from the perspectives of the architecture and the operating system. In contrast with previous related studies that emphasized the relative performance of synchronization algorithms, this paper takes a new approach by analyzing the sources of synchronization …
This paper assesses the performance and scalability of several software synchronization algorithms, as well as the interrelationship between synchronization, multiprogramming, and parallel job scheduling, on ccNUMA systems. Using the SGI Origin2000, we evaluate synchronization algorithms for spin locks, lock-free concurrent queues, and barriers. We analyze …
The growing availability of Linked Data and other structured information on the Web does not keep pace with the rich semantic descriptions and conceptual associations that would be necessary for direct deployment of user-tailored services. In contrast, the more complex descriptions become, the harder it is to reason about them. To show the efficacy of a …