Statistical Inference for Big Data Problems in Molecular Biophysics
@inproceedings{Ramanathan2012StatisticalIF, title={Statistical Inference for Big Data Problems in Molecular Biophysics}, author={Arvind Ramanathan and Andrej J. Savol and Virginia M. Burger and Shannon P. Quinn and Pratul K. Agarwal and Chakra Chennubhotla}, year={2012} }
We highlight the role of statistical inference techniques in providing biological insights from the analysis of long time-scale molecular simulation data. Technological and algorithmic improvements in computation have brought molecular simulations to the forefront of techniques applied to investigating the basis of living systems. While these longer simulations, increasingly complex and now reaching petabyte scales, promise a detailed view into microscopic behavior, teasing out the important…
6 Citations
Distributed Spectral Graph Methods for Analyzing Large-Scale Unstructured Biomedical Data
- Biology
- 2014
A quantitative model of ciliary motion phenotypes is developed using spectral graph methods for unsupervised latent pattern discovery, and a distributed hierarchical eigensolver is compared directly to other popular solvers for its essential role in enabling the discovery of novel ciliary motion phenotypes and in identifying physiochemical-perceptual associations.
AI-Driven Multiscale Simulations Illuminate Mechanisms of SARS-CoV-2 Spike Dynamics
- Computer Science
- bioRxiv
- 2020
A generalizable AI-driven workflow is developed that leverages heterogeneous HPC resources to explore the time-dependent dynamics of molecular systems and demonstrates how AI can accelerate conformational sampling across different systems and pave the way for the future application of such methods to additional studies in SARS-CoV-2 and other molecular systems.
Benchmarking Machine Learning Workloads in Structural Bioinformatics Applications
- Computer Science
- 2020
This paper presents an overview of different learning approaches in structural bioinformatics applications, discusses performance considerations for such coupled applications, and outlines the development of performance metrics, in the hope that this could serve as a framework for other application domains.
AI-driven multiscale simulations illuminate mechanisms of SARS-CoV-2 spike dynamics
- Computer Science
- Int. J. High Perform. Comput. Appl.
- 2021
A generalizable AI-driven workflow is developed that leverages heterogeneous HPC resources to explore the time-dependent dynamics of molecular systems and presents several novel scientific discoveries, including the elucidation of the spike’s full glycan shield and the characterization of the flexible interactions between the spike and the human ACE2 receptor.
Challenges and frontiers of computational modelling of biomolecular recognition
- Biology, Chemistry
- QRB Discovery
- 2022
The challenges and computational approaches developed to characterise biomolecular binding, including molecular docking, molecular dynamics simulations (especially enhanced sampling) and machine learning are reviewed.
Deep clustering of protein folding simulations
- Biology, Computer Science
- BMC Bioinformatics
- 2018
The CVAE model can quantitatively describe complex biophysical processes such as protein folding, and can be used to learn latent features of protein folding that can be applied to other independent trajectories, making it particularly attractive for identifying intrinsic features that correspond to conformational substates that share similar structural features.
References
Event detection and sub‐state discovery from biomolecular simulations using higher‐order statistics: Application to enzyme adenylate kinase
- Biology
- Proteins
- 2012
HOST4MD is presented: a higher‐order statistical toolbox for molecular dynamics simulations, which identifies key dynamical events as simulations are in progress, explores potential sub‐states, and identifies conformational transitions that enable the protein to access those sub‐states.
On-the-Fly Identification of Conformational Substates from Molecular Dynamics Simulations.
- Biology
- Journal of Chemical Theory and Computation
- 2011
It is demonstrated that the patterns discovered by DTA often correspond to functionally important conformational substates and that DTA is well-suited to analyzing long timescale simulations, which are critical for studying biologically relevant motions but may be too large for traditional analysis methods.
Full correlation analysis of conformational protein dynamics
- Biology
- Proteins
- 2008
FCA should provide improved collective degrees of freedom for dimension‐reduced descriptions of macromolecular dynamics; the improvement is shown to be due to a strongly increased anharmonicity of FCA modes as compared to the respective PCA modes.
Clustering Molecular Dynamics Trajectories: 1. Characterizing the Performance of Different Clustering Algorithms.
- Computer Science
- Journal of Chemical Theory and Computation
- 2007
There is no one perfect "one size fits all" algorithm for clustering MD trajectories, and the results strongly depend on the choice of atoms for the pairwise comparison; the best performance was observed with the average-linkage, means, and SOM algorithms.
Discovering Conformational Sub-States Relevant to Protein Function
- Chemistry
- PLoS ONE
- 2011
Quasi-anharmonic analysis (QAA) provides a novel framework to intuitively understand the biophysical basis of conformational diversity and its relevance to protein function.
Transiently populated intermediate functions as a branching point of the FF domain folding pathway
- Biology
- Proceedings of the National Academy of Sciences
- 2012
This study establishes the FF domain intermediate as a central player in both folding and misfolding pathways and illustrates how incomplete folding can lead to the formation of higher-order structures.
Hidden alternate structures of proline isomerase essential for catalysis
- Chemistry
- Nature
- 2009
Dual strategies of ambient-temperature X-ray crystallographic data collection and automated electron-density sampling are introduced to structurally unravel interconverting substates of the human proline isomerase, cyclophilin A (CYPA).
Accessing a Hidden Conformation of the Maltose Binding Protein Using Accelerated Molecular Dynamics
- Biology
- PLoS Comput. Biol.
- 2011
Periplasmic binding proteins (PBPs) are a large family of molecular transporters that play a key role in nutrient uptake and chemotaxis in Gram-negative bacteria. All PBPs have characteristic…
A scalable parallel framework for analyzing terascale molecular dynamics simulation trajectories
- Computer Science
- 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis
- 2008
A new parallel analysis framework called HiMach is presented, which allows users to write trajectory analysis programs sequentially and carries out the parallel execution of the programs automatically, along with an extension to the original MapReduce model to support multiple rounds of analysis.