Isoefficiency analysis helps us determine the best algorithm/architecture combination for a particular problem without explicitly analyzing all possible combinations under all possible conditions. The fastest sequential algorithm for a given problem is the best sequential algorithm. But determining the best parallel algorithm is considerably more …
We present algorithms for the symbolic and numerical factorization phases in the direct solution of sparse unsymmetric systems of linear equations. We have modified a classical symbolic factorization algorithm for unsymmetric matrices to inexpensively compute minimal elimination structures. We give an efficient algorithm to compute a near-minimal …
In this paper, we describe a scalable parallel algorithm for sparse matrix factorization, analyze its performance and scalability, and present experimental results for up to 1024 processors on a Cray T3D parallel computer. Through our analysis and experimental results, we demonstrate that our algorithm substantially improves the state of the art in …
This paper presents a design flow that optimizes a standard-cell based circuit for performance by implementing critical paths in a Programmable Logic Array (PLA). Given a standard-cell based circuit as input, our approach iteratively extracts critical paths from this circuit, which are then implemented using a PLA circuit. PLAs are a good candidate for such …
In this paper, we describe scalable parallel algorithms for sparse matrix factorization, analyze their performance and scalability, and present experimental results for up to 1024 processors on a Cray T3D parallel computer. Through our analysis and experimental results, we demonstrate that our algorithms substantially improve the state of the art in …
In this paper, we present the scalability analysis of the parallel Fast Fourier Transform algorithm on mesh- and hypercube-connected multicomputers using the isoefficiency metric. The isoefficiency function of an algorithm-architecture combination is defined as the rate at which the problem size should grow with the number of processors to maintain a fixed …
There are several metrics that characterize the performance of a parallel system, such as parallel execution time, speedup, and efficiency. A number of properties of these metrics have been studied. For example, it is a well-known fact that given a parallel architecture and a problem of a fixed size, the speedup of a parallel algorithm does not continue to …
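As a rough illustration of the metrics named in these abstracts, the sketch below computes speedup and efficiency and then estimates how much work is needed to hold efficiency constant as processors are added. It is not taken from any of the listed papers: the overhead model `overhead(p) = 2 p log2 p` and all function names are hypothetical, chosen only to make the relations concrete. It assumes the standard definitions, speedup S = T_seq / T_par and efficiency E = S / p, and takes the quantity held fixed in the isoefficiency definition to be efficiency.

```python
import math


def speedup(t_seq: float, t_par: float) -> float:
    """Ratio of sequential time to parallel time."""
    return t_seq / t_par


def efficiency(t_seq: float, t_par: float, p: int) -> float:
    """Speedup divided by the number of processors."""
    return speedup(t_seq, t_par) / p


def overhead(p: int) -> float:
    """Hypothetical total overhead T_o(p) = 2 * p * log2(p).

    This is an illustrative assumption, not a model from the source.
    """
    return 2.0 * p * math.log2(p)


def parallel_time(work: float, p: int) -> float:
    """Parallel time when useful work plus overhead is shared by p processors."""
    return (work + overhead(p)) / p


def isoefficiency_work(p: int, target_e: float) -> float:
    """Work W that keeps efficiency at target_e on p processors.

    From E = W / (W + T_o) it follows that W = (E / (1 - E)) * T_o.
    """
    k = target_e / (1.0 - target_e)
    return k * overhead(p)


if __name__ == "__main__":
    for p in (4, 16, 64):
        w = isoefficiency_work(p, 0.8)
        e = efficiency(w, parallel_time(w, p), p)
        print(f"p={p:3d}  work={w:8.1f}  efficiency={e:.3f}")
```

Under this assumed overhead model, the required work grows in proportion to p log2 p. Holding the work fixed while increasing p makes efficiency drop, which matches the observation above that speedup on a fixed-size problem does not keep improving.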