The ARB (from Latin arbor, tree) project was initiated almost 10 years ago. The ARB program package comprises a variety of directly interacting software tools for sequence database maintenance and analysis which are controlled by a common graphical user interface. Although it was initially designed for ribosomal RNA data, it can be used for any nucleic and ...
Tubulins are still considered typical proteins of eukaryotes. More recently, however, they have been found in the unusual bacteria Prosthecobacter (btubAB). In this study, the genomic organization of the btub genes and their genomic environment were characterized using the newly developed Two-Step Gene Walking method. In all investigated ...
A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for ...
MOTIVATION: We tackle the problem of finding regularities in microarray data. Various data mining tools, such as clustering, classification, Bayesian networks and association rules, have been applied so far to gain insight into gene-expression data. Association rule mining techniques used so far work on discretizations of the data and cannot account for ...
PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence, it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, ...
We tackle the problem of finding association rules for quantitative data. Whereas most of the previous approaches operate on hyperrectangles, we propose a representation based on half-spaces. Consequently, the left-hand side and right-hand side of an association rule do not contain a conjunction of items or intervals, but a weighted sum of variables ...
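To make the half-space representation concrete, here is a minimal sketch in Python. It is not the authors' algorithm. It assumes that each side of a rule tests a weighted sum of variables against a threshold, which is the standard definition of a half-space; the abstract is cut off before it confirms this. The function names, weights, thresholds and toy data are all hypothetical.

```python
def in_half_space(row, weights, threshold):
    """Return True if the weighted sum of the row's variables reaches the threshold.

    Assumption: a rule side has the form sum(w_i * x_i) >= threshold.
    """
    return sum(w * x for w, x in zip(weights, row)) >= threshold


def rule_stats(rows, lhs, rhs):
    """Return (support, confidence) of the rule lhs -> rhs.

    lhs and rhs are (weights, threshold) pairs, each describing one half-space.
    Support is the fraction of rows in both half-spaces; confidence is the
    fraction of rows in the lhs half-space that are also in the rhs half-space.
    """
    n_lhs = 0
    n_both = 0
    for row in rows:
        if in_half_space(row, *lhs):
            n_lhs += 1
            if in_half_space(row, *rhs):
                n_both += 1
    support = n_both / len(rows) if rows else 0.0
    confidence = n_both / n_lhs if n_lhs else 0.0
    return support, confidence


# Hypothetical toy data: each row holds three quantitative variables.
rows = [
    (1.0, 2.0, 3.0),
    (2.0, 0.5, 1.0),
    (0.0, 0.0, 5.0),
    (3.0, 3.0, 0.0),
]
lhs = ((1.0, 1.0, 0.0), 2.5)  # x0 + x1 >= 2.5
rhs = ((0.0, 0.0, 1.0), 1.0)  # x2 >= 1.0
support, confidence = rule_stats(rows, lhs, rhs)
```

Unlike an interval-based rule, each side here is a single linear inequality over all variables, so the rule needs no prior discretization of the data.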
We present a new and comprehensive approach to inductive databases in the relational model. The main contribution is a new inductive query language extending SQL, with the goal of supporting the whole knowledge discovery process, from pre-processing via data mining to post-processing. A prototype system supporting the query language was developed in the ...
We address the problem of learning a predictive model for growth inhibition from the NCI DTP human tumor cell line screening data. Extending the classical Quantitative Structure Activity Relationship paradigm, we investigate whether including gene expression data leads to a statistically significant improvement of prediction quality. Our analysis shows that ...
In the demonstration, we will present the concepts and an implementation of an inductive database, as proposed by Imielinski and Mannila, in the relational model. The goal is to support all steps of the knowledge discovery process, from pre-processing via data mining to post-processing, on the basis of queries to a database system. The query ...