Learn More
Ab initio protein structure prediction methods generate numerous structural candidates, which are referred to as decoys. The decoy with the most number of neighbors of up to a threshold distance is typically identified as the most representative decoy. However, the clustering of decoys needed for this criterion involves computations with runtimes that are(More)
The pattern languages are languages that are generated from patterns, and were first proposed by Angluin as a nontrivial class that is inferable from positive data [D. Angluin, Finding patterns common to a set of strings, Journal of Computer and System Sciences 21 (1980) 46–62; D. Angluin, Inductive inference of formal languages from positive data,(More)
In this paper we examine the issues involved in finding consensus patterns from biosequence data of very small sample sizes, by searching for so-called minimal multiple generalization (mmg), that is, a set of syntactically minimal patterns that accounts for all the samples. The data we use are the sigma regulons with more conserved consensus patterns for(More)
A tree pattern p is a first-order term in formal logic, and the language of p is the set of all the tree patterns obtainable by replacing each variable in p with a tree pattern containing no variables. We consider the inductive inference of the unions of these languages from positive examples using strategies that guarantee some forms of minimality during(More)
Co-expressing genes tend to cluster in eukaryotic genomes. This paper analyzes correlation between the proximity of eukaryotic genes and their transcriptional expression pattern in the zebrafish (Danio rerio) genome using available microarray data and gene annotation. The analyses show that neighbouring genes are significantly coexpressed in the zebrafish(More)
We consider the problem of finding a set of patterns that best characterizes a set of strings. To this end, Arimura et. al. [3] considered the use of minimal multiple generalizations (mmg) for such characterizations. Given any sample set, the mmgs are, roughly speaking, the most (syntactically) specific set of languages containing the sample within a given(More)
In this paper we consider a basic clustering problem that has uses in bioinformatics. A structural fragment is a sequence of ` points in a 3D space, where ` is a fixed natural number. Two structural fragments f1 and f2 are equivalent if and only if f1 = f2 · R + τ under some rotation R and translation τ . We consider the distance between two structural(More)