Mohammad Zubair

In this paper we propose a feature extraction based algorithm (FEBA) for sparse matrix-vector multiplication. The key idea of FEBA is to exploit any regular structure present in the sparse matrix by extracting it and processing it separately. The order in which these structures are extracted is determined by the relative efficiency with which they can be …
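The idea of pulling regular structure out of a sparse matrix and handling it separately can be illustrated with a minimal sketch. It assumes the simplest possible "feature", the main diagonal, and a dictionary-of-coordinates storage format; the function names `split_diagonal` and `spmv_split` are hypothetical and are not taken from the paper, which may extract other structures and order them differently.

```python
def split_diagonal(n, entries):
    """Split a sparse n x n matrix into its main diagonal and the rest.

    `entries` maps (row, col) -> value. The diagonal is returned as a dense
    list (a regular structure), the remainder as a list of (row, col, value).
    """
    diag = [0.0] * n
    rest = []
    for (i, j), v in entries.items():
        if i == j:
            diag[i] = v
        else:
            rest.append((i, j, v))
    return diag, rest


def spmv_split(n, diag, rest, x):
    """Compute y = A x by processing the extracted diagonal first,
    then the irregular remainder."""
    # Regular part: contiguous, index-free access.
    y = [diag[i] * x[i] for i in range(n)]
    # Irregular part: generic coordinate-format loop with indirect indexing.
    for i, j, v in rest:
        y[i] += v * x[j]
    return y


def spmv_reference(n, entries, x):
    """Plain coordinate-format product, used only to check the split version."""
    y = [0.0] * n
    for (i, j), v in entries.items():
        y[i] += v * x[j]
    return y
```

The split version performs the same arithmetic as the reference; the possible benefit is that the diagonal loop needs no index arrays, which is the kind of efficiency difference the abstract alludes to.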
With the advent of multicore and many-core architectures, we are facing a problem that is new to parallel computing, namely, the management of hierarchical parallel caches. One major limitation of all earlier models is their inability to model multicore processors with varying degrees of sharing of caches at different levels. We propose a unified memory …
The usefulness of the many on-line journals and scientific digital libraries that exist today is limited by the lack of a service that can federate them through a unified interface. The Open Archives Initiative (OAI) is one major effort to address technical interoperability among distributed archives. The objective of OAI is to develop a framework to …
In this paper, we propose a scheme for matrix-matrix multiplication on a distributed-memory parallel computer. The scheme hides almost all of the communication cost with the computation and uses the standard, optimized Level-3 BLAS operation on each node. As a result, the overall performance of the scheme is nearly equal to the performance of the Level-3 …
In this paper, we introduce a concept called algorithmic prefetching, for exploiting some of the features of the IBM RISC System/6000 computer. Algorithmic prefetching denotes changing algorithm A to algorithm B, which contains additional steps to move data from slower levels of memory to faster levels, with the aim that algorithm B outperform algorithm …
In this paper we propose a parallel high-performance FFT algorithm based on a multi-dimensional formulation. We use this to solve a commonly encountered FFT-based kernel on a distributed-memory parallel machine, the IBM scalable parallel system, SP1. The kernel requires a forward FFT computation of an input sequence, multiplication of the transformed data …
This paper describes our efforts to develop a toolset and process for automated metadata extraction from large, diverse, and evolving document collections. A number of federal agencies, universities, laboratories, and companies are placing their collections online and making them searchable via metadata fields such as author, title, and publishing …
In this paper we propose a parallel algorithm for the planted motif problem that arises in computational biology. A variety of algorithms have been proposed in the literature to solve this problem. The drawback of all these algorithms is that they have been designed to work on serial computers and are not suitable for parallelization on current multicore …
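For readers unfamiliar with the planted motif problem, the following is a minimal brute-force sketch of its usual (l, d) formulation: find every string of length l that occurs, with at most d mismatches, in each input sequence. This is a serial reference enumeration under that assumed formulation, not the parallel algorithm the abstract proposes; the function names and the default DNA alphabet are hypothetical choices for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(c1 != c2 for c1, c2 in zip(a, b))


def occurs_within(seq, motif, d):
    """True if some window of seq is within Hamming distance d of motif."""
    l = len(motif)
    return any(hamming(seq[k:k + l], motif) <= d
               for k in range(len(seq) - l + 1))


def planted_motifs(seqs, l, d, alphabet="ACGT"):
    """Enumerate all |alphabet|**l candidates and keep those that appear,
    with at most d mismatches, in every sequence. Exponential in l, so
    suitable only for tiny examples."""
    found = []
    for cand in product(alphabet, repeat=l):
        motif = "".join(cand)
        if all(occurs_within(s, motif, d) for s in seqs):
            found.append(motif)
    return found
```

For example, `planted_motifs(["ACGTAC", "TTACGA", "GACGTT"], 3, 0)` returns `["ACG"]`, the only 3-mer shared exactly by all three sequences.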