
- Nagiza F. Samatova, George Ostrouchov, Al Geist, Anatoli V. Melechko
- Distributed and Parallel Databases
- 2002

This paper presents a hierarchical clustering method named RACHET (Recursive Agglomeration of Clustering Hierarchies by Encircling Tactic) for analyzing multi-dimensional distributed data. A typical clustering algorithm requires bringing all the data into a centralized warehouse, resulting in O(nd) transmission cost, where n is the number of data points…
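The transmission-cost contrast can be sketched in a few lines. This is an illustrative toy, not RACHET itself (which exchanges per-site dendrogram descriptors): each location ships a handful of centroid/count summaries instead of its raw rows, so per-site traffic drops from O(nd) to O(kd).

```python
import numpy as np

rng = np.random.default_rng(0)

def local_summary(block, k=3):
    """Summarize a local data block by up to k (centroid, count) pairs:
    a crude stand-in for the per-site descriptors a distributed
    clustering scheme would transmit instead of raw data."""
    seeds = block[rng.choice(len(block), size=k, replace=False)]
    # one-pass nearest-seed assignment (not full k-means; enough for a sketch)
    labels = np.argmin(((block[:, None, :] - seeds) ** 2).sum(axis=-1), axis=1)
    return [(block[labels == j].mean(axis=0), int((labels == j).sum()))
            for j in range(k) if (labels == j).any()]

# two "sites", each holding n = 500 points in d = 8 dimensions
sites = [rng.normal(size=(500, 8)), rng.normal(loc=3.0, size=(500, 8))]
summaries = [s for site in sites for s in local_summary(site)]

# each site transmits O(k*d) numbers rather than its O(n*d) raw rows
```

Every point is covered by exactly one summary, so the counts add up to the total data size while the transmitted volume stays tiny.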

- Narate Taerat, Nichamon Naksinehaboon, +5 authors Christian Engelmann
- 2009 International Conference on Availability…
- 2009

System- and application-level failures can be characterized by analyzing relevant log files. The resulting data can then inform numerous studies of, and future developments for, mission-critical and large-scale computational architectures, in fields such as failure prediction, reliability modeling, performance modeling, and power awareness…

- Jeremy S. Logan, Scott Klasky, +6 authors Matthew Wolf
- Euro-Par
- 2012

We address the difficulty involved in obtaining meaningful measurements of I/O performance in HPC applications, as well as the further challenge of understanding the causes of I/O bottlenecks in these applications. The need for I/O optimization is critical given the difficulty in scaling I/O to ever-increasing numbers of processing cores. To address this…

This paper presents a novel algorithm for identification and functional characterization of "key" genome features responsible for a particular biochemical process of interest. The central idea is that individual genome features are identified as "key" features if the discrimination accuracy between two classes of genomes with respect to a given biochemical…
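The central idea, scoring each genome feature by how well it alone discriminates the two classes, can be illustrated with a toy presence/absence matrix. This is a hypothetical setup, not the paper's algorithm; feature 5 is deliberately planted as the discriminator.

```python
import numpy as np

rng = np.random.default_rng(2)

# toy feature matrix: rows = genomes, columns = binary presence/absence features
X = rng.integers(0, 2, size=(40, 10))
y = np.array([0] * 20 + [1] * 20)   # two classes of genomes
X[:20, 5] = 0                       # plant feature 5 as perfectly discriminative
X[20:, 5] = 1

def discrimination_accuracy(col, y):
    """Accuracy of the best single-feature rule 'predict class from presence'."""
    acc = (col == y).mean()
    return max(acc, 1 - acc)        # allow the inverted (absence) rule too

scores = np.array([discrimination_accuracy(X[:, j], y) for j in range(X.shape[1])])
key = int(np.argmax(scores))        # the "key" feature: highest discrimination score
```

Features whose score approaches 1.0 separate the two genome classes almost perfectly; random background features hover near 0.5.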

We describe a new method for computing a global principal component analysis (PCA) for the purpose of dimension reduction in data distributed across several locations. We assume that a virtual n × p (items × features) data matrix is distributed by blocks of rows (items), where n > p and the distribution among s locations is determined by a given…
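For covariance-based PCA, one way such a row-block distribution can be exploited is for each location to send only its column sums and its p × p cross-product matrix. This is a sufficient-statistics sketch under assumed notation, not necessarily the paper's own merging scheme:

```python
import numpy as np

rng = np.random.default_rng(1)
# the virtual n x p matrix, distributed by row blocks across s = 3 "locations"
blocks = [rng.normal(size=(n_i, 4)) for n_i in (100, 250, 150)]

# each location transmits O(p^2) sufficient statistics, never its raw rows
n = sum(len(b) for b in blocks)
s = sum(b.sum(axis=0) for b in blocks)   # per-block column sums
ss = sum(b.T @ b for b in blocks)        # per-block cross-product matrices

mean = s / n
cov = (ss - n * np.outer(mean, mean)) / (n - 1)   # global sample covariance

evals, evecs = np.linalg.eigh(cov)       # global principal axes

# sanity check against centralized PCA on the assembled matrix
X = np.vstack(blocks)
cov_direct = np.cov(X, rowvar=False)
assert np.allclose(cov, cov_direct)
```

For covariance-based PCA this reconstruction is exact, not an approximation; the statistical subtleties the paper addresses arise in how local PCA results are merged under more restrictive communication.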

Overview: The tutorial will introduce attendees to high performance computing concepts for dealing with big data using R, particularly on large distributed platforms. We will describe the use of the "programming with big data in R" (pbdR) package ecosystem by presenting several examples of varying complexity. Our packages provide infrastructure to use and…

- Jingqian Jiang, Michael W. Berry, June M. Donato, George Ostrouchov, Nancy W. Grady
- Intell. Data Anal.
- 1999

Systemic pathways-oriented approaches to analysis of metabolic networks are effective for small networks but are computationally infeasible for genome-scale networks. Current computational approaches to this analysis are based on the mathematical principles of convex analysis. The enumeration of a complete set of "systemically independent"…

- Stephen L. Scott, Christian Engelmann, +11 authors Jyothish Varma
- PPOPP
- 2009

In order to address anticipated high failure rates, resiliency characteristics have become an urgent priority for next-generation extreme-scale high-performance computing (HPC) systems. This poster describes our past and ongoing efforts in novel fault resilience technologies for HPC. Presented work includes proactive fault resilience techniques, system and…

- George Ostrouchov, Nagiza F. Samatova
- IEEE Transactions on Pattern Analysis and Machine…
- 2005

FastMap is a dimension reduction technique that operates on distances between objects. Although only distances are used, implicitly the technique assumes that the objects are points in a p-dimensional Euclidean space. It selects a sequence of k ≤ p orthogonal axes defined by distant pairs of points (called pivots) and computes the projection of the…
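The pivot-and-project step can be sketched as follows. This is a minimal reimplementation of the published FastMap scheme of Faloutsos and Lin with a simplified farthest-point pivot pass, not code from the paper under discussion:

```python
import numpy as np

def fastmap(D, k):
    """Embed n objects into k dimensions given their pairwise distance matrix D."""
    n = len(D)
    D2 = D.astype(float) ** 2
    X = np.zeros((n, k))
    for axis in range(k):
        # pick a distant pivot pair (a, b) with one farthest-point pass
        a = 0
        b = int(np.argmax(D2[a]))
        a = int(np.argmax(D2[b]))
        dab2 = D2[a, b]
        if dab2 == 0:
            break
        # cosine-law projection of every object onto the pivot line
        x = (D2[a] + dab2 - D2[b]) / (2 * np.sqrt(dab2))
        X[:, axis] = x
        # residual squared distances in the hyperplane orthogonal to this axis
        D2 = np.clip(D2 - (x[:, None] - x[None, :]) ** 2, 0, None)
    return X

# collinear points: a single FastMap axis recovers their spacing exactly
pts = np.array([[0.0], [1.0], [4.0]])
D = np.abs(pts - pts.T)
X = fastmap(D, 1)
```

For points that truly lie on a line, the one-dimensional embedding reproduces all pairwise distances; for general data each successive axis captures residual structure.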