Algorithmic dimensionality reduction for molecular structure analysis.

W. Michael Brown, Shawn Martin, Sara N. Pollock, Evangelos A. Coutsias, Jean-Paul Watson
The Journal of Chemical Physics, volume 129, issue 6
Dimensionality reduction approaches have been used to exploit the redundancy in a Cartesian coordinate representation of molecular motion by producing low-dimensional representations of that motion. Such representations have been used to help visualize complex energy landscapes, to extend the time scales of simulation, and to improve the efficiency of optimization. Until recently, only linear approaches for dimensionality reduction have been employed. Here, we investigate the efficacy of several automated…

Using Dimensionality Reduction to Analyze Protein Trajectories

This paper analyzed a molecular dynamics trajectory of the C-terminal fragment of the immunoglobulin binding domain B1 of streptococcal protein G, modeled in explicit solvent, using a range of dimensionality reduction algorithms. The projections generated by each algorithm were compared systematically by using a clustering algorithm to find the positions and extents of the basins in the high-dimensional energy landscape.

Geometric Issues in Dimensionality Reduction and Protein Conformation Space

The puzzling dimensionality reduction results for the β-hairpin, in which the linear method PCA performed better than the nonlinear methods ISOMAP and LLE, are discussed, and it is shown that nonlinear surfaces without certain specified properties are not necessarily better suited to nonlinear dimensionality reduction methods than to linear methods.

Metadynamics in the conformational space nonlinearly dimensionally reduced by Isomap.

A simulation with a bias potential acting in the directions of collective motions determined by a nonlinear dimensionality reduction method is presented, which allows essentially any parameter of the system to be used as a collective variable in biased simulations.

UMAP as a Dimensionality Reduction Tool for Molecular Dynamics Simulations of Biomacromolecules: A Comparison Study.

The comparison of the raw high-dimensional data with the projections obtained using different dimensionality reduction methods based on various metrics showed that UMAP has superior performance when compared with linear reduction methods (PCA and tICA) and has competitive performance and scalable computational cost.

Evaluation of Dimensionality-reduction Methods from Peptide Folding-unfolding Simulations.

This study evaluated several nonlinear methods (locally linear embedding, Isomap, and diffusion maps), as well as principal component analysis, on the equilibrium folding/unfolding trajectory of the second β-hairpin of the B1 domain of streptococcal protein G.

Machine learning in multiscale modeling and simulations of molecular systems

A novel method is proposed, atlas of collective variables, that systematically overcomes topological obstacles, ameliorates the geometrical distortions and thus allows NLDR techniques to perform optimally in molecular simulations.

Unsupervised Learning Methods for Molecular Simulation Data

This Review provides a comprehensive overview of the methods of unsupervised learning that have been most commonly used to investigate simulation data and indicates likely directions for further developments in the field.

Diffusion maps, clustering and fuzzy Markov modeling in peptide folding transitions.

It is demonstrated how manifold learning techniques may complement and enhance informed intuition commonly used to construct reduced descriptions of the dynamics in molecular conformation space to construct robust Markov state models.

Constructing Grids for Molecular Quantum Dynamics Using an Autoencoder.

A machine learning approach is presented that utilizes an autoencoder that is trained to find a low-dimensional representation of a set of molecular configurations that can be used to generate a potential energy surface grid in the desired subspace.



Nonlinear Dimensionality Reduction

The purpose of the book is to summarize clear facts and ideas about well-known methods as well as recent developments in the topic of nonlinear dimensionality reduction, which encompasses many of the recently developed methods.

Low-dimensional, free-energy landscapes of protein-folding reactions by nonlinear dimensionality reduction

The proposed method to obtain a few collective coordinates by using nonlinear dimensionality reduction can efficiently find a low-dimensional representation of a complex process such as protein folding.

A global geometric framework for nonlinear dimensionality reduction.

An approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set and efficiently computes a globally optimal solution, and is guaranteed to converge asymptotically to the true structure.
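
The idea summarized above, using local distances to recover global geometry, can be illustrated with a minimal sketch. It assumes the approach commonly known as Isomap: build a k-nearest-neighbour graph, then take shortest-path lengths through that graph as approximate geodesic distances. The function names, the toy semicircle data, and the choice k=2 are illustrative assumptions and are not taken from the paper; the final embedding step (classical multidimensional scaling on the geodesic distances) is omitted.

```python
import math


def knn_graph(points, k):
    """Symmetric distance matrix that keeps only each point's k nearest neighbours."""
    n = len(points)
    graph = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        graph[i][i] = 0.0
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def geodesic_distances(graph):
    """All-pairs shortest paths (Floyd-Warshall) over the neighbourhood graph."""
    n = len(graph)
    dist = [row[:] for row in graph]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


# Toy data (assumption): 21 points sampled evenly along a unit semicircle.
points = [(math.cos(math.pi * t / 20), math.sin(math.pi * t / 20)) for t in range(21)]

geo = geodesic_distances(knn_graph(points, k=2))
chord = math.dist(points[0], points[-1])  # straight-line distance between the ends
arc = geo[0][-1]                          # graph distance, following the curve
```

In this toy case the straight-line distance between the two ends is the diameter of the circle, while the graph distance follows the curve and approaches the arc length. That difference is what lets a method of this kind "unroll" a curved manifold where a linear projection cannot.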

Efficient sampling in collective coordinate space

A novel method is presented that combines the simulation of an ensemble of concurrent trajectories with restraints acting on the ensemble of structures as a whole. It is used to probe the resistance of a structure against conformational changes along collective modes and clearly distinguishes soft from stiff modes.

Dihedral angle principal component analysis of molecular dynamics simulations.

It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide.
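
As a hedged illustration of why dihedral angles need special handling before PCA, the sketch below maps each angle to a (cos, sin) pair, which is the transformation usually associated with dPCA. The function name and the example angles are assumptions made for illustration; the PCA step itself is not shown.

```python
import math


def dihedral_features(angles_deg):
    """Map each dihedral angle (degrees) to a (cos, sin) pair, flattened into one list."""
    features = []
    for angle in angles_deg:
        radians = math.radians(angle)
        features.extend((math.cos(radians), math.sin(radians)))
    return features


# Two nearly identical conformations on either side of the +/-180 degree wrap-around.
a = dihedral_features([179.0])
b = dihedral_features([-179.0])

raw_gap = abs(179.0 - (-179.0))  # large, although the angles differ by only 2 degrees
feature_gap = math.dist(a, b)    # small, because (cos, sin) is continuous on the circle
```

Because every angle corresponds to exactly one point on the unit circle, the mapping loses no information while removing the artificial jump at the periodic boundary, so a variance-based method such as PCA no longer treats the two conformations as far apart.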

Complexity of free energy landscapes of peptides revealed by nonlinear principal component analysis

The study revealed that NLPCA reduces the dimensionality of the considered systems much better than PCA does, and showed that many states in the PCA maps mix several peptide conformations, while those in the NLPCA maps are purer.

Can principal components yield a dimension reduced description of protein dynamics on long time scales?

A quantitative evaluation of the convergence of conformational coordinates obtained with principal component analysis suggests that simulations of a few nanoseconds should generally provide a stable and statistically reliable definition of the essential and near constraints subspaces.

Dynamics of essential collective motions in proteins: theory.

M. Stepanova. Physical Review E, Statistical, Nonlinear, and Soft Matter Physics, 2007.
A rigorous theoretical background is provided for identifying dynamically correlated domains in a macromolecule and for constructing coarse-grained models that represent the conformational motions in a protein through a few interacting domains embedded in a dissipative medium.

Collective Langevin dynamics of conformational motions in proteins.

Collective Langevin dynamics (CLD), which evolves the dynamics of the system within a small subspace of relevant collective degrees of freedom, yielded accurate thermodynamical and dynamical behaviors.

Complete maps of molecular‐loop conformational spaces

This paper presents a numerical method to compute all possible conformations of distance‐constrained molecular loops, i.e., loops where some interatomic distances are held fixed, while others can vary, allowing an exhaustive analysis and visualization of all pseudo‐rotation paths between different conformations satisfying loop closure.