Geoffrey C. Fox

The MapReduce programming model has simplified the implementation of many data-parallel applications. The simplicity of the programming model and the quality of services provided by many implementations of MapReduce have attracted considerable enthusiasm among distributed computing communities. Drawing on years of experience in applying MapReduce to various scientific …
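To make the programming model concrete, here is a minimal self-contained Python sketch that simulates the map, shuffle, and reduce phases of a word count in a single process; the function names are illustrative and not drawn from any particular MapReduce implementation.

    from collections import defaultdict

    def map_phase(document):
        # Map: emit an intermediate (word, 1) pair for every word.
        for word in document.split():
            yield word.lower(), 1

    def shuffle(pairs):
        # Shuffle: group intermediate values by key, as the framework
        # would between the map and reduce phases.
        groups = defaultdict(list)
        for key, value in pairs:
            groups[key].append(value)
        return groups

    def reduce_phase(key, values):
        # Reduce: combine all counts for one word into a single total.
        return key, sum(values)

    documents = ["the map phase emits pairs", "the reduce phase sums pairs"]
    intermediate = [pair for doc in documents for pair in map_phase(doc)]
    counts = dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
    print(counts)  # {'the': 2, 'map': 1, 'phase': 2, ...}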
This report consists of two major portions. The first is the presentation of a methodology for measuring the performance of supercomputers. It includes a set of thirteen Fortran programs that total well over 50,000 lines of source code. They represent applications in a number of areas of engineering and scientific computing, and in many cases they represent …
Workflows have emerged as a paradigm for representing and managing complex distributed computations and are used to accelerate the pace of scientific progress. A recent National Science Foundation workshop brought together domain, computer, and social scientists to discuss requirements of future scientific applications and the challenges they present to …
Most scientific data analyses involve analyzing voluminous data collected from various instruments. Efficient parallel and concurrent algorithms and frameworks are the key to meeting the scalability and performance requirements entailed in such scientific data analyses. The recently introduced MapReduce technique has gained considerable attention from the …
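As a sketch of the kind of parallel execution such analyses need, the following Python example fans a simple per-chunk kernel out over worker processes with the standard multiprocessing module and then merges the partial results; the histogram kernel and the four-way chunking are illustrative assumptions, not the paper's method.

    from multiprocessing import Pool

    def analyze_chunk(chunk):
        # Stand-in analysis kernel: histogram the readings in one chunk.
        counts = {}
        for reading in chunk:
            counts[reading] = counts.get(reading, 0) + 1
        return counts

    def merge(partials):
        # Reduce step: combine the per-chunk histograms into one result.
        total = {}
        for part in partials:
            for key, value in part.items():
                total[key] = total.get(key, 0) + value
        return total

    if __name__ == "__main__":
        data = [i % 5 for i in range(1_000_000)]  # fake instrument data
        chunks = [data[i::4] for i in range(4)]   # split across 4 workers
        with Pool(processes=4) as pool:
            print(merge(pool.map(analyze_chunk, chunks)))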
A Peer-to-Peer (P2P) Grid would comprise services that include those of Grids and P2P networks and naturally support environments that have features of both limiting cases. Such a P2P Grid integrates the evolving ideas of computational grids, distributed objects, web services, P2P networks, and message-oriented middleware. In this paper we investigate the …
A deterministic annealing technique is proposed for the nonconvex optimization problem of clustering. Deterministic annealing is used to avoid the local minima of the given cost function that trap traditional techniques. A set of temperature-parametrized Gibbs probability density functions relates each data point to each cluster. An effective cost …
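The following Python sketch illustrates the idea: at temperature T each point is softly associated with each cluster through a Gibbs distribution proportional to exp(-d/T), the cluster centers are recomputed as probability-weighted means, and T is lowered gradually so the soft assignments harden; the cooling schedule and all parameter values here are illustrative assumptions, not the paper's.

    import numpy as np

    def deterministic_annealing(points, k, t_start=10.0, t_min=0.01, cool=0.9):
        rng = np.random.default_rng(0)
        centers = points[rng.choice(len(points), size=k, replace=False)]
        temperature = t_start
        while temperature > t_min:
            for _ in range(20):  # fixed-point iterations at this temperature
                # Squared distances from every point to every center.
                d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
                # Gibbs association probabilities at temperature T
                # (row minima subtracted for numerical stability).
                g = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / temperature)
                p = g / g.sum(axis=1, keepdims=True)
                # Centers move to the probability-weighted means of the data.
                centers = (p.T @ points) / p.sum(axis=0)[:, None]
            temperature *= cool  # cooling step
        return centers, p

    rng = np.random.default_rng(1)
    points = np.concatenate([rng.normal(m, 0.3, (50, 2)) for m in (-2, 0, 2)])
    centers, assignments = deterministic_annealing(points, k=3)
    print(np.round(centers, 2))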
This paper describes the design of the Fortran 90D/HPF compiler, a source-to-source parallel compiler for distributed-memory systems being developed at Syracuse University. Fortran 90D/HPF is a data-parallel language with special directives that specify data alignment and distribution. A systematic methodology to process the distribution directives of Fortran …
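To make the distribution directives concrete: the compiler must turn each directive into an ownership map from global array indices to processors. The small Python sketch below shows the arithmetic behind BLOCK- and CYCLIC-style distributions; it is a toy illustration of the concept, not code from the Fortran 90D/HPF compiler itself.

    def owner(i, n, p, distribution="block"):
        # Map global index i of an n-element array onto one of p processors.
        if distribution == "block":
            block = -(-n // p)   # ceiling division: contiguous chunk each
            return i // block
        if distribution == "cyclic":
            return i % p         # deal elements out round-robin
        raise ValueError(distribution)

    n, p = 10, 4
    print([owner(i, n, p, "block") for i in range(n)])   # [0,0,0,1,1,1,2,2,2,3]
    print([owner(i, n, p, "cyclic") for i in range(n)])  # [0,1,2,3,0,1,2,3,0,1]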
As Cloud computing emerges as a dominant paradigm in distributed systems, it is important to fully understand the underlying technologies that make Clouds possible. One technology, and perhaps the most important, is virtualization. Recently, virtualization, through the use of hypervisors, has become widely used and well understood by many. However, there …