Aman Singla

An important attribute in the specification of many compute-intensive applications is "time". Simulation of interactive virtual environments is one such domain. There is a mismatch between the synchronization and consistency guarantees needed by such applications (which are temporal in nature) and the guarantees offered by current shared memory systems.(More)
The goal of this work is to explore architectural mechanisms for supporting explicit communication in cache-coherent shared memory multiprocessors. The motivation stems from the observation that applications display wide diversity in terms of sharing characteristics and hence impose different communication requirements on the system. Explicit communication(More)
Scalability studies of parallel architectures have used scalar metrics to evaluate their performance. Very often, it is difficult to glean the sources of inefficiency resulting from the mismatch between the algorithmic and architectural requirements using such scalar metrics. Low-level performance studies of the hardware are also inadequate for predicting the(More)
The overheads in a parallel system that limit its scalability need to be identified and separated in order to enable parallel algorithm design and the development of parallel machines. Such overheads may be broadly classified into two components. The first one is intrinsic to the algorithm and arises due to factors such as the work-imbalance and the serial(More)
We consider the problems of finding minimum P-edge connected and P-vertex connected subgraphs in a given graph. These problems are NP-hard. We provide better techniques to lower bound the size of the minimum subgraphs. This allows us to achieve approximation factors of a and $ respectively, thereby improving on existing algorithms that achieve factors p(More)
Synthesizing architectural requirements from an application viewpoint can help in making important architectural design decisions towards building large scale parallel machines. In this paper, we quantify the link bandwidth requirement on a binary hypercube topology for a set of five parallel applications. We use an execution-driven simulator called SPASM(More)
Abstracting features of parallel systems is a technique that has been traditionally used in theoretical and analytical models for program development and performance evaluation. In this paper, we explore the use of abstractions in execution-driven simulators in order to speed up simulation. In particular, we evaluate abstractions for the interconnection network and(More)
In this paper we present a new approach to benchmark the performance of shared memory systems. This approach focuses on recognizing how far the performance of a given memory system is from that of a realistic ideal parallel machine. We define such a realistic machine model called the z-machine, which accounts for the inherent communication costs in an(More)
Evaluating and analyzing the performance of a parallel application on an architecture to explain the disparity between projected and delivered performance is an important aspect of parallel systems research. However, conducting such a study is hard due to the vast design space of these systems. In this paper, we study two important aspects related to the(More)
Scalability is a term frequently used to qualify the match between an algorithm and architecture in a parallel system (an algorithm-architecture combination). Evaluating the scalability of a parallel system has widespread applicability. The results from such an evaluation may be used to: select the best architecture platform for an(More)