Single nucleotide polymorphism (SNP) studies and random mutagenesis projects identify amino acid substitutions in protein-coding regions. Each substitution has the potential to affect protein function. SIFT (Sorting Intolerant From Tolerant) is a program that predicts whether an amino acid substitution affects protein function so that users can prioritize ...
The effect of genetic mutation on phenotype is of significant interest in genetics. The type of genetic mutation that causes a single amino acid substitution (AAS) in a protein sequence is called a non-synonymous single nucleotide polymorphism (nsSNP). An nsSNP could potentially affect the function of the protein, subsequently altering the carrier's ...
MOTIVATION: As databanks grow, sequence classification and prediction of function by searching protein family databases becomes increasingly valuable. The original Blocks Database, which contains ungapped multiple alignments for families documented in Prosite, can be searched to classify new sequences. However, Prosite is incomplete, and families from other ...
Nonsynonymous single nucleotide polymorphisms (nsSNPs) are coding variants that introduce amino acid changes in their corresponding proteins. Because nsSNPs can affect protein function, they are believed to have the largest impact on human health compared with SNPs in other regions of the genome. Therefore, it is important to distinguish those nsSNPs that ...
Centromeric H3-like histones, which replace histone H3 in the centromeric chromatin of animals and fungi, have not been reported in plants. We identified a histone H3 variant from Arabidopsis thaliana that encodes a centromere-identifying protein designated HTR12. By immunological detection, HTR12 localized at centromeres in both mitotic and meiotic cells. ...
Each column of amino acids in a multiple alignment of protein sequences can be represented as a vector of 20 amino acid counts. For alignment and searching applications, the count vector is an imperfect representation of a position, because the observed sequences are an incomplete sample of the full set of related sequences. One general solution to this ...
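The count-vector representation described in this abstract can be illustrated with a minimal Python sketch. It covers only the raw counts for one alignment column; because the abstract is truncated, the paper's proposed solution is not shown. The function name `column_counts`, the residue ordering in `AMINO_ACIDS`, and the toy alignment are all assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids in alphabetical order of their one-letter
# codes. This ordering is an assumption; any fixed ordering would work.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_counts(alignment, col):
    """Return a 20-element count vector for one column of an ungapped alignment.

    `alignment` is a list of equal-length sequence strings and `col` is a
    zero-based column index. Characters outside AMINO_ACIDS are ignored.
    """
    counts = Counter(seq[col] for seq in alignment)
    return [counts.get(aa, 0) for aa in AMINO_ACIDS]


# Toy alignment of four short sequences (hypothetical data).
alignment = ["MKV", "MRV", "MKI", "LKV"]
vector = column_counts(alignment, 0)
```

With only four observed sequences, 18 of the 20 entries for the first column are zero, which shows why a small sample makes the raw count vector an incomplete picture of the position.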
Centromeres are the last frontiers of complex eukaryotic genomes, consisting of highly repetitive sequences that resist mapping, cloning and sequencing. The centromere of rice Chromosome 8 (Cen8) has an unusually low abundance of highly repetitive satellite DNA, which allowed us to determine its sequence. A region of approximately 750 kb in Cen8 binds rice ...
The Blocks Database WWW (http://blocks.fhcrc.org) and Email (blocks@blocks.fhcrc.org) servers provide tools to search DNA and protein queries against the Blocks+ Database of multiple alignments, which represent conserved protein regions. Blocks+ nearly doubles the number of protein families included in the database by adding families from the Pfam-A, ...
The Sorting Intolerant from Tolerant (SIFT) algorithm predicts the effect of coding variants on protein function. It was first introduced in 2001, with a corresponding website that provides users with predictions on their variants. Since its release, SIFT has become one of the standard tools for characterizing missense variation. We have updated SIFT's ...
A program has been developed that provides molecular biologists with multiple tools for searching databases, yet uses a very simple interface. PATMAT can use protein or (translated) DNA sequences, patterns or blocks of aligned proteins as queries of databases consisting of amino acid or nucleotide sequences, patterns or blocks. The ability to search ...