- Shiraj Khan, Sharba Bandyopadhyay, +4 authors George Ostrouchov
- Physical review. E, Statistical, nonlinear, and…
- 2007

Commonly used dependence measures, such as linear correlation, cross-correlogram, or Kendall's tau, cannot capture the complete dependence structure in data unless the structure is restricted to linear, periodic, or monotonic. Mutual information (MI) has been frequently utilized for capturing the complete dependence structure including nonlinear…
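
As a rough illustration of the contrast drawn in this abstract, the sketch below compares linear correlation with a simple binned mutual information estimate on a deterministic nonlinear relationship, y = x². The equal-width histogram estimator, the bin count, and the function names (`pearson`, `binned`, `mutual_information`) are assumptions made for this example; they are not the estimators studied in the paper.

```python
import math
from collections import Counter

def pearson(xs, ys):
    """Sample linear (Pearson) correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def binned(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(xs, ys, bins=8):
    """Plug-in MI estimate (in nats) from a 2-D histogram.

    Assumption: equal-width binning, chosen only for simplicity.
    """
    n = len(xs)
    bx, by = binned(xs, bins), binned(ys, bins)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )

# Symmetric grid on [-1, 1]; y depends on x completely, but not linearly.
xs = [i / 500 - 1 for i in range(1001)]
ys = [x * x for x in xs]

r = pearson(xs, ys)
mi = mutual_information(xs, ys)
print(f"Pearson r = {r:.3g}, binned MI = {mi:.3g} nats")
```

On this symmetric input the correlation is essentially zero while the binned MI estimate is clearly positive, which is the kind of dependence the abstract says linear measures miss.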

We describe a new method for computing a global principal component analysis (PCA) for the purpose of dimension reduction in data distributed across several locations. We assume that a virtual n × p (items × features) data matrix is distributed by blocks of rows (items), where n > p and the distribution among s locations is determined by a given…

- Nagiza F. Samatova, George Ostrouchov, Al Geist, Anatoli V. Melechko
- Distributed and Parallel Databases
- 2002

This paper presents a hierarchical clustering method named RACHET (Recursive Agglomeration of Clustering Hierarchies by Encircling Tactic) for analyzing multi-dimensional distributed data. A typical clustering algorithm requires bringing all the data in a centralized warehouse. This results in O(nd) transmission cost, where n is the number of data points…

- Jeremy S. Logan, Scott Klasky, +6 authors Matthew Wolf
- Euro-Par
- 2012

We address the difficulty involved in obtaining meaningful measurements of I/O performance in HPC applications, as well as the further challenge of understanding the causes of I/O bottlenecks in these applications. The need for I/O optimization is critical given the difficulty in scaling I/O to ever increasing numbers of processing cores. To address this…

- George Ostrouchov, Nagiza F. Samatova
- IEEE Transactions on Pattern Analysis and Machine…
- 2005

FastMap is a dimension reduction technique that operates on distances between objects. Although only distances are used, implicitly the technique assumes that the objects are points in a p-dimensional Euclidean space. It selects a sequence of k ≤ p orthogonal axes defined by distant pairs of points (called pivots) and computes the projection of the…
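
A minimal sketch of a single FastMap axis may help make the pivot idea concrete. It assumes Euclidean input points and the commonly used "farthest point" pivot heuristic, and it applies the standard law-of-cosines projection onto the line through the two pivots. The function name `fastmap_axis` and the sample points are hypothetical, and the sketch covers only the first of the k axes; it is not the analysis presented in the paper.

```python
import math

def fastmap_axis(points, dist=math.dist):
    """Project points onto one FastMap axis using only pairwise distances.

    Returns (pivot_a, pivot_b, coords). Pivot choice is a heuristic
    (an assumption here): start at index 0, jump to the farthest point,
    then jump once more to the point farthest from that one.
    """
    idx = range(len(points))
    b = max(idx, key=lambda i: dist(points[0], points[i]))
    a = max(idx, key=lambda i: dist(points[b], points[i]))
    d_ab = dist(points[a], points[b])
    if d_ab == 0:  # all points coincide; nothing to project
        return a, b, [0.0] * len(points)
    # Law of cosines: coordinate of p along the line from pivot a to pivot b.
    coords = [
        (dist(points[a], p) ** 2 + d_ab ** 2 - dist(points[b], p) ** 2)
        / (2 * d_ab)
        for p in points
    ]
    return a, b, coords

pts = [(0, 0), (1, 0), (2, 0), (4, 0), (2, 3)]
a, b, coords = fastmap_axis(pts)
print(a, b, [round(c, 6) for c in coords])
```

For these sample points the pivots lie on the horizontal axis, so each computed coordinate matches the point's first coordinate up to rounding. Later axes would be obtained by repeating the step on residual distances, which this sketch omits.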

- Narate Taerat, Nichamon Naksinehaboon, +5 authors Christian Engelmann
- 2009 International Conference on Availability…
- 2009

System- and application-level failures could be characterized by analyzing relevant log files. The resulting data might then be used in numerous studies on and future developments for the mission-critical and large scale computational architecture, including fields such as failure prediction, reliability modeling, performance modeling and power awareness.…

- Faisal N. Abu-Khzam, Nagiza F. Samatova, George Ostrouchov, Michael A. Langston, Al Geist
- IASTED PDCS
- 2002

It is well known that information retrieval, clustering and visualization can often be improved by reducing the dimensionality of high dimensional data. Classical techniques offer optimality but are much too slow for extremely large databases. The problem becomes harder yet when data are distributed across geographically dispersed machines. To address this…

- Jeremy Bejarano, Koushiki Bose, +4 authors George Ostrouchov
- 2011

Due to current data collection technology, our ability to gather data has surpassed our ability to analyze it. In particular, k-means, one of the simplest and fastest clustering algorithms, is ill-equipped to handle extremely large datasets on even the most powerful machines. Our new algorithm uses a sample from a dataset to decrease runtime by reducing the…

Systemic pathways-oriented approaches to analysis of metabolic networks are effective for small networks but are computationally infeasible for genome scale networks. Current computational approaches to this analysis are based on the mathematical principles of convex analysis. The enumeration of a complete set of “systemically independent” metabolic…