Minimotif Miner: a tool for investigating protein function

Authors: Sudha Balla, Vishal Thapar, Snigdha Verma, T. Y. Luong, Tanaz Faghri, Chun-Hsi Huang, Sanguthevar Rajasekaran, Jacob J. del Campo, Jessica H Shinn, William Mohler, Mark W. Maciejewski, Michael Robert Gryk, Bryan Piccirillo, Stanley R Schiller, Martin R. Schiller
Journal: Nature Methods
In addition to large domains, many short motifs mediate functional post-translational modification of proteins as well as protein-protein interactions and protein trafficking functions. We have constructed a motif database comprising 312 unique motifs and a web-based tool for identifying motifs in proteins. Functional motifs predicted by MnM can be ranked by several approaches, and we validated these scores by analyzing thousands of confirmed examples and by confirming prediction of previously… 

Minimotif Miner: A Computational Tool to Investigate Protein Function, Disease, and Genetic Diversity

  • M. Schiller
  • Biology, Computer Science
    Current protocols in protein science
  • 2007
Scoring based on evolutionary conservation, protein surface prediction, and motif frequency can be used in conjunction with other motif programs and the known biology of the query to reduce false‐positive predictions and select short motifs for experimental pursuit.

Computational Techniques for Motif Search

This paper surveys some of the techniques proposed in the literature for motif identification in the study of protein-protein interactions, noting that such motifs are typically very short.

Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset

This investigation demonstrates the important effects of a high-quality negative dataset on the performance of such statistical inference and proposes an approach to discovering motif pairs at interaction sites that are essential for understanding protein functions and helpful for the rational design of protein engineering and folding experiments.

Structural conservation of a short, functional, peptide-sequence motif.

The many bioinformatic tools and resources which discover, define and catalogue the various, known protein domains as well as assist users by identifying domain signatures within proteins of interest are reviewed.

Large-Scale Discovery and Characterization of Protein Regulatory Motifs in Eukaryotes

It is shown that short, linear protein motifs can be efficiently recovered from proteome-scale datasets such as sub-cellular localization, molecular function, half-life, and protein abundance data using an information-theoretic approach, and that these predicted motifs provide focused hypotheses for experimental validation.

In silico protein motif discovery and structural analysis.

This chapter collects some of the most common and useful tools available for protein motif discovery and secondary and tertiary structure prediction from a primary amino acid sequence and provides pointers to many other tools.

A computational strategy for the prediction of functional linear peptide motifs in proteins

A procedure for identifying functional motifs is developed that scores pattern conservation in homologous sequences, explicitly taking into account their sequence similarity to the query sequence. It helps guide experiments by focusing on those short linear peptide motifs that have a high probability of being functional.

Computational identification and analysis of protein short linear motifs.

Focusing on disordered regions of proteins, where SLiMs are predominantly found, and masking out non-conserved residues can reduce the level of noise but more work is required to improve the quality of high-throughput experimental datasets as input for computational discovery.

A new protein linear motif benchmark for multiple sequence alignment software

None of the programs currently available is capable of reliably aligning LMs in distantly related sequences and a number of specific problems are highlighted.

Discovering Interaction Motifs from Protein Interaction Networks

This chapter reviews the different approaches to finding interaction motifs and discusses their implications, potential, and possible areas for future improvement.



ELM server: a new resource for investigating short functional sites in modular eukaryotic proteins

ELM, the Eukaryotic Linear Motif server, is a new bioinformatics resource for investigating candidate short non-globular functional motifs in eukaryotic proteins, aiming to fill a void in bioinformatics tools.

Scansite 2.0: proteome-wide prediction of cell signaling interactions using short sequence motifs

Scansite identifies short protein sequence motifs that are recognized by modular signaling domains, phosphorylated by protein Ser/Thr- or Tyr-kinases or mediate specific interactions with protein or phospholipid ligands, allowing segments of biological pathways to be constructed in silico.

Prediction of post‐translational glycosylation and phosphorylation of proteins from the amino acid sequence

A new method for kinase‐specific prediction of phosphorylation sites, NetPhosK, is presented, which extends the earlier and more general tool NetPhos. The issues of underestimation and over‐prediction, and strategies for improving prediction specificity, are discussed.

Sequence and structure-based prediction of eukaryotic protein phosphorylation sites.

An artificial neural network method is presented that predicts phosphorylation sites in independent sequences with a sensitivity in the range from 69% to 96% and predicts novel phosphorylated sites in the p300/CBP protein that may regulate interaction with transcription factors and histone acetyltransferase activity.

14-3-3 Proteins: Active Cofactors in Cellular Regulation by Serine/Threonine Phosphorylation

This work has shown that the 14-3-3 proteins participate in a wide range of biologic processes acting through a variety of regulatory mechanisms mediated mostly, and perhaps exclusively, through their binding to phosphoserine-containing sequence motifs in diverse partners.

Combining the GOR V algorithm with evolutionary information for protein secondary structure prediction from amino acid sequence

The GOR algorithm for the protein secondary structure prediction has been modified and improved by using the evolutionary information provided by multiple sequence alignments, adding triplet statistics, and optimizing various parameters.

PhosphoBase, a database of phosphorylation sites: release 2.0

PhosphoBase contains information about phosphorylated residues in proteins and data about peptide phosphorylation by a variety of protein kinases. The data are collected from literature and compiled

Prediction of protein surface accessibility with information theory

A new, simple method based on information theory is introduced to predict the solvent accessibility of amino acid residues in various states defined by their different thresholds, with good prediction accuracy in a jackknife test system.

RefSeq and LocusLink: NCBI gene-centered resources

Together, RefSeq and LocusLink provide a non-redundant view of genes and other loci to support research on genes and gene families, variation, gene expression and genome annotation.