Artificial neural networks have been applied to the prediction of splice site location in human pre-mRNA. A joint prediction scheme where prediction of transition regions between introns and exons regulates a cutoff level for splice site assignment was able to predict splice site locations with confidence levels far better than previously reported in the…
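The idea of letting a transition prediction regulate the splice site cutoff can be illustrated with a minimal sketch. The abstract does not give the actual rule, so everything here is an assumption made for illustration: the function names, the window size, the `low` and `high` cutoff bounds, and the linear interpolation between them.

```python
def regulated_cutoff(coding, pos, window=3, low=0.3, high=0.9):
    """Return a splice-site cutoff for position `pos`.

    `coding` is a per-position coding-potential score in [0, 1].
    Assumption: the cutoff drops linearly from `high` to `low` as the
    difference between upstream and downstream coding potential grows.
    """
    left = coding[max(0, pos - window):pos]
    right = coding[pos:pos + window]
    if not left or not right:
        # No context on one side: fall back to the strictest cutoff.
        return high
    transition = abs(sum(left) / len(left) - sum(right) / len(right))
    return high - (high - low) * transition


def assign_sites(site_scores, coding, **kwargs):
    """Return positions whose local score reaches the regulated cutoff."""
    return [
        i for i, score in enumerate(site_scores)
        if score >= regulated_cutoff(coding, i, **kwargs)
    ]
```

With a uniform local score, only the position sitting on a sharp exon/intron transition passes, because that is where the cutoff is lowest.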
We have developed a new method for the identification of signal peptides and their cleavage sites based on neural networks trained on separate sets of prokaryotic and eukaryotic sequences. The method performs significantly better than previous prediction schemes, and can easily be applied to genome-wide data sets. Discrimination between cleaved signal…
Artificial neural networks have been combined with a rule-based system to predict intron splice sites in the dicot plant Arabidopsis thaliana. A two-step prediction scheme, where a global prediction of the coding potential regulates a cutoff level for a local prediction of splice sites, is refined by rules based on splice site confidence values, prediction…
When preparing data sets of amino acid or nucleotide sequences it is necessary to exclude redundant or homologous sequences in order to avoid overestimating the predictive performance of an algorithm. For some time methods for doing this have been available in the area of protein structure prediction. We have developed a similar procedure based on pair-wise…
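As a rough illustration of redundancy reduction by pairwise comparison, the sketch below greedily keeps a sequence only if it is not too similar to any sequence already kept. This is not the procedure from the abstract, whose details are truncated here. The `similarity` function is a stand-in built on Python's `difflib` rather than a real sequence alignment score, and the threshold of 0.8 is arbitrary.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity in [0, 1]; NOT an alignment-based score."""
    return SequenceMatcher(None, a, b).ratio()


def reduce_redundancy(sequences, threshold=0.8):
    """Greedily keep sequences whose similarity to every kept one is below threshold."""
    kept = []
    for seq in sequences:
        if all(similarity(seq, other) < threshold for other in kept):
            kept.append(seq)
    return kept
```

The result depends on input order, since earlier sequences are always preferred. A real pipeline would replace `similarity` with an alignment score and a statistically motivated cutoff.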
We analyse the sequential structure of human exons and their flanking introns by hidden Markov models. Together, models of donor site regions, acceptor site regions and flanked internal exons show that exons, besides the reading frame, hold a specific periodic pattern. The pattern, which has the consensus non-T(A/T)G and a minimal periodicity of roughly…
In this paper we present a novel method for using the learning ability of a neural network as a measure of information in local regions of input data. Using the method to analyze Escherichia coli promoters, we discover all previously described signals, and furthermore find new signals that are regularly spaced along the promoter region. The spacing of all…
A neural network trained to classify the 61 nucleotide triplets of the genetic code into 20 amino acid categories develops in its internal representation a pattern matching the relative cost of transferring amino acids with satisfied backbone hydrogen bonds from water to an environment of dielectric constant of roughly 2.0. Such environments are typically…
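The classification task described above has a small, fully enumerable data set: the 64 triplets of the standard genetic code minus the three stop codons. The sketch below builds those 61 input/target pairs. The 12-bit one-hot encoding (four bits per base) and all names are assumptions for illustration. The abstract does not state how the inputs were encoded, and no network is trained here.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G; '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# The 61 amino-acid-coding triplets (stop codons removed).
SENSE_CODONS = {codon: aa for codon, aa in CODON_TABLE.items() if aa != "*"}
# The 20 output categories, in a fixed order.
CLASSES = sorted(set(SENSE_CODONS.values()))


def one_hot_codon(codon):
    """Encode a triplet as 12 bits, four per base (assumed encoding)."""
    return [1 if base == b else 0 for base in codon for b in BASES]


def training_pairs():
    """Return (input vector, class index) pairs for all sense codons."""
    return [
        (one_hot_codon(codon), CLASSES.index(aa))
        for codon, aa in SENSE_CODONS.items()
    ]
```

Any small classifier with 12 inputs and 20 outputs could be trained on `training_pairs()`. The hidden-layer analysis the abstract refers to is outside the scope of this sketch.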
Human genes are not continuous but rather consist of short coding regions (exons) interspersed with highly variable non-coding regions (introns). We apply HMMs to the problem of modeling exons and introns and detecting splice sites in the human genome. Our most interesting result so far is the detection of particular oscillatory patterns, with a minimal…
The use of databanks in genetic research assumes reliability of the information they contain. Currently, error-detection in the manually or electronically entered data contained in the nucleotide sequence databanks at EMBL, Heidelberg and GenBank at Los Alamos is limited. We have used a subset of sequences from these databanks to train neural networks to…