Addressing the unmet need for visualizing conditional random fields in biological data

William C. Ray, Samuel L. Wolock, Nicholas W. Callahan, Min Dong, Q. Quinn Li, Chun Liang, Thomas J. Magliery, Christopher W. Bartlett
BMC Bioinformatics, p. 202
Background: The biological world is replete with phenomena that appear to be ideally modeled and analyzed by one archetypal statistical framework - the Graphical Probabilistic Model (GPM). The structure of GPMs is a uniquely good match for biological problems that range from aligning sequences to modeling the genome-to-phenome relationship. The fundamental questions that GPMs address involve making decisions based on a complex web of interacting factors. Unfortunately, while GPMs ideally fit many…
The Importance of Weakly Co-Evolving Residue Networks in Proteins is Revealed by Visual Analytics
A visualization approach is developed that eschews the common “look at a long list of statistics” method and instead enables the user to literally look at all of the co-evolution statistics simultaneously, shedding light on the biophysical importance of different types of co-evolution.
Random Fields in Physics, Biology and Data Science
A random field is the representation of the joint probability distribution for a set of random variables. Markov fields, in particular, have a long standing tradition as the theoretical foundation of
Optimization of Synthetic Proteins: Identification of Interpositional Dependencies Indicating Structurally and/or Functionally Linked Residues
A visual analytics tool, StickWRLD, is developed that creates an interactive 3D representation of a protein alignment and clearly displays covarying residues. It has previously been used successfully to identify functionally required covarying residues in proteins such as Adenylate Kinase and in DNA sequences such as endonuclease target sites.
Comparative Analysis between Notations to Classify Named Entities using Conditional Random Fields
The IO notation is found to yield better F-measure results than the BILOU notation in all categories of the HAREM corpus.


MAVL/StickWRLD for protein: visualizing protein sequence families to detect non-consensus features
MAVL (multiple alignment variation linker) and StickWRLD provide a web-based method to visually survey the model-training sequences in order to discover and characterize possible dependencies. The basic visualization has been augmented in several ways to enhance protein viewing.
Beyond Identity- When Classical Homology Searching Fails, Why, and What you Can do About It
The results of this analysis and the generalization of the modeling method are presented, which allows development of models of similar power for arbitrary RNA families.
MAVL and StickWRLD: visually exploring relationships in nucleic acid sequence alignments
MAVL is the web-based application for detecting and displaying both positive and negative inter-positional correlations in nucleic acid sequences and StickWRLD is the virtual reality modeling language for this application.
Flexible Linked Axes for Multivariate Data Visualization
This work explores how to give the user more freedom to define visualizations, based on the usage of Flexible Linked Axes, and allows users to define composite visualizations that automatically support brushing and linking.
Parallel sets: visual analysis of categorical data
Parallel sets is a new visualization method that adopts the layout of parallel coordinates but substitutes a frequency-based representation for the individual data points. This allows efficient work with metadata, which is particularly important when dealing with categorical datasets.
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
A New Axes Re-ordering Method in Parallel Coordinates Visualization
A new axes re-ordering method in parallel coordinates visualization is proposed: a similarity-based method, which is based on the combination of Nonlinear Correlation Coefficient (NCC) and Singular Value Decomposition (SVD) algorithms.
Predicting Functional Effect of Human Missense Mutations Using PolyPhen‐2
PolyPhen‐2 (Polymorphism Phenotyping v2), available as software and via a Web server, predicts the possible impact of amino acid substitutions on the stability and function of human proteins using
Mapping Nominal Values to Numbers for Effective Visualization
A new technique, called the Distance-Quantification-Classing (DQC) approach, is proposed to preprocess nominal variables before they are imported into a visual exploration tool, and the XmdvTool package is extended to incorporate this approach.
Many-to-Many Relational Parallel Coordinates Displays
A new configuration of the axes in multiple parallel coordinates displays is devised and its usability is investigated; the results are promising.