Deep Learning on Operational Facility Data Related to Large-Scale Distributed Area Scientific Workflows

Alok Singh, Eric G. Stephan, Malachi Schram, Ilkay Altintas
2017 IEEE 13th International Conference on e-Science (e-Science)

Distributed computing platforms provide a robust mechanism to perform large-scale computations by splitting the task and data among multiple locations, possibly located thousands of miles apart geographically. Although such distribution of resources can lead to benefits, it also comes with its associated problems such as rampant duplication of file transfers increasing congestion, long job completion times, unexpected site crashing, suboptimal data transfer rates, unpredictable reliability in a… 

Toward a Methodology and Framework for Workflow-Driven Team Science

A conceptual design is presented toward the development of methodologies and services for effective workflow-driven collaborations, namely the PPoDS methodology for collaborative workflow development and the SmartFlows Services for smart execution in a rapidly evolving cyberinfrastructure ecosystem.

A demonstration of modularity, reuse, reproducibility, portability and scalability for modeling and simulation of cardiac electrophysiology using Kepler Workflows

This article develops, describes, and tests a computational workflow that serves as a proof of concept of a platform for the robust integration and implementation of a reusable and reproducible multi-scale cardiac cell and tissue model that is expandable, modular, and portable.



Leveraging large sensor streams for robust cloud control

Using Machine Learning modeling techniques on data from a real instrumented cluster, it is demonstrated that predictive modeling on operational sensor data can directly reduce systems operations monitoring costs and improve system reliability.

Data provenance hybridization supporting extreme-scale scientific workflow applications

IPPD's provenance management solution (ProvEn) and its hybrid data store, which combines both of these data provenance perspectives, are described, along with design and implementation details covering provenance disclosure, scalability, data integration, and query and analysis capabilities.

Machine learning - a probabilistic perspective

K. Murphy, Adaptive Computation and Machine Learning series, 2012
This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

Performance of combined production and analysis WMS in DIRAC

The performance of the DIRAC WMS will be presented with emphasis on how the system copes with many varied job requirements, and experience with gLExec will be described.

Long Short-Term Memory

A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

Greedy Layer-Wise Training of Deep Networks

These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.

Learning Deep Architectures for AI

The motivations and principles regarding learning algorithms for deep architectures are discussed, in particular those that exploit as building blocks the unsupervised learning of single-layer models such as Restricted Boltzmann Machines, which are used to construct deeper models such as Deep Belief Networks.

Deep Learning of Representations: Looking Forward

This paper proposes to examine some of the challenges of scaling deep learning algorithms to much larger models and datasets, reducing optimization difficulties due to ill-conditioning or local minima, designing more efficient and powerful inference and sampling procedures, and learning to disentangle the factors of variation underlying the observed data.

Training Recurrent Neural Networks

A new probabilistic sequence model that combines Restricted Boltzmann Machines and RNNs is described, more powerful than similar models while being less difficult to train, and a random parameter initialization scheme is described that allows gradient descent with momentum to train RNNs on problems with long-term dependencies.

Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding

The results show that on this task, both types of recurrent networks outperform the CRF baseline substantially, and a bi-directional Jordan-type network that takes into account both past and future dependencies among slots works best, outperforming a CRF-based baseline by 14% in relative error reduction.