Bioinformatics: Organisms from Venus, Technology from Jupiter, Algorithms from Mars

Authors: Bart De Moor, Kathleen Marchal, Janick Mathys, Yves Moreau
Journal: Eur. J. Control
In this paper, we discuss data sets generated by microarray technology, which makes it possible to measure the activity or expression of thousands of genes in parallel. We discuss the basics of the technology, how to preprocess the data, and how classical and newly developed algorithms can be used to gain insight into the biological processes that generated the data. Algorithms we discuss are Principal Component Analysis, clustering techniques such as…
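
As a rough illustration of the first technique named in the abstract, the sketch below computes the leading principal axis of a small expression matrix by power iteration on the sample covariance matrix. It is a generic textbook procedure, not the authors' implementation. The function name `first_principal_component`, the layout (rows are experiments, columns are genes), and the toy data in the usage note are all assumptions made for this example.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return the unit-length leading principal axis of `rows`.

    `rows` is a list of samples (e.g. experiments), each a list of
    feature values (e.g. gene expression levels). Hypothetical helper
    written for illustration only.
    """
    n, d = len(rows), len(rows[0])
    # Center each feature (column) on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Asymmetric starting vector; power iteration would stall if the start
    # happened to be exactly orthogonal to the leading axis.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along the current direction
            break
        v = [x / norm for x in w]
    return v
```

For example, `first_principal_component([[1, 2, 0], [2, 4, 0], [3, 6, 0], [4, 8, 0]])` returns a vector along the direction (1, 2, 0), because all variation in that toy matrix lies on one line. For real microarray data, with thousands of genes, a library routine based on the singular value decomposition would be the more practical choice.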

Multivariate Statistical Tools for the Evaluation of Proteomic 2D-maps: Recent Achievements and Applications

This review reports the most recent achievements in statistical tools applied to proteome research by two-dimensional gel electrophoresis (2D-GE) and describes the theoretical aspects of the multivariate methods adopted in this field.


Starting from a paradigmatic classification problem involving two kinds of leukaemia, the use of data-mining techniques in this context is discussed, with particular attention to the classification methods and to all the data analysis steps, including data pre-processing and information retrieval.


In this paper, starting from a paradigmatic classification problem of two kinds of Leukaemia, the use of data-mining techniques in such a context is discussed, with particular attention to the classification methods.

An unsupervised clustering approach for leukaemia classification based on DNA micro-arrays data

In this paper, starting from a paradigmatic classification problem of two kinds of Leukaemia, the use of data-mining techniques in such a context is discussed, with particular attention to the classification method.

On the identification of sparse gene regulatory networks

A linear dynamical model structure is introduced to describe the gene interactions in a sparse gene regulatory network, which generalizes the set-up in [9]. Techniques from robust statistics based on

50 years of data mining and OR: upcoming trends and challenges

A series of upcoming trends and challenges for data mining and its role within operational research (OR) are outlined, including linear and quadratic optimization, genetic algorithms and concepts based on artificial ant colonies.

Clustering of Pancreatic Endocrine Tumors Via Microarray Gene Expression Analysis

A simple, multivariable and linearly initialized clustering is shown to be able to deal with unsupervised classification of the data originating from pancreatic endocrine tumors (PET). Results

Computational biology and toxicogenomics

Reliable screening of drug candidates for toxicological side effects in the early stages of lead component development can help in prioritizing candidates and in avoiding the futile use of expensive clinical trials and animal tests.

Bayesian Framework for Least-Squares Support Vector Machine Classifiers, Gaussian Processes, and Kernel Fisher Discriminant Analysis

The LS-SVM formulation has clear primal-dual interpretations, and without the bias term, one explicitly constructs a model that yields the same expressions as have been obtained with GPs for regression, which has advantages for deriving analytic expressions in a Bayesian evidence framework.



Functional bioinformatics of microarray data: from expression to regulation

This work integrates clustering of coexpressed genes with the discovery of binding motifs and presents a clustering algorithm (called adaptive quality-based clustering), which is developed to address several shortcomings of existing methods.

A Gibbs sampling method to detect over-represented motifs in the upstream regions of co-expressed genes

A modification to the original Gibbs Sampling algorithm is presented, introducing a probability distribution to estimate the number of copies of the motif in a sequence and the incorporation of a higher-order background model.

Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies.

We present here a simple and fast method allowing the isolation of DNA binding sites for transcription factors from families of coregulated genes, with results illustrated in Saccharomyces

A higher-order background model improves the detection of promoter regulatory elements by Gibbs sampling

It is shown that the use of a higher-order model considerably enhances the performance of the motif finding algorithm in the presence of noisy data.
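
To make the idea of a higher-order background model concrete, the sketch below estimates a Markov chain of a chosen order from a set of DNA sequences and scores a new sequence under it. This is a generic illustration and not the model or code from the paper; the function names `train_background` and `log_background_score`, the pseudocount smoothing, and the four-letter alphabet are assumptions made for the example.

```python
import math
from collections import defaultdict


def train_background(sequences, order=3, pseudocount=1.0):
    """Estimate P(base | preceding `order` bases) from training sequences.

    Returns a function prob(context, base). Hypothetical helper for
    illustration; assumes the alphabet A, C, G, T.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1.0

    def prob(context, base):
        ctx = counts.get(context, {})
        # Pseudocounts keep unseen (context, base) pairs from getting
        # probability zero.
        total = sum(ctx.values()) + 4 * pseudocount
        return (ctx.get(base, 0.0) + pseudocount) / total

    return prob


def log_background_score(seq, prob, order=3):
    """Log-probability of `seq` under the background model.

    The first `order` bases have no full context and are skipped.
    """
    return sum(
        math.log(prob(seq[i - order:i], seq[i]))
        for i in range(order, len(seq))
    )
```

A motif finder can compare such a background score with the score of the same segment under a motif model. With `order=0` the function reduces to plain single-nucleotide frequencies, which is the simpler kind of background that a higher-order model is meant to improve on.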

Adaptive quality-based clustering of gene expression profiles

A novel adaptive quality-based clustering algorithm that tackles some of the drawbacks of classical clustering algorithms and derives an optimal radius of the cluster so that only the significantly coexpressed genes are included in the cluster.

Array of hope

The genome sequences have not only made a new era of exploration imperative, but, providentially, they have also made it possible to take a fresh, comprehensive and open-minded look at every question in biology.

Preprocessing implementation for microarray (PRIM): an efficient method for processing cDNA microarray data.

A data processing method that very efficiently extracts reproducible data from the result of duplicate experiments, designed to automatically filter the raw results obtained from cDNA microarray image-analysis software.

Systems Biology: the Reincarnation of Systems Theory Applied in Biology?

The domain of systems theory, its application to biology and the lessons that can be learned from the work of Robert Rosen are reviewed.

A web site for the computational analysis of yeast regulatory sequences

A series of computer programs was developed for the analysis of regulatory sequences, with a special focus on yeast. The site also provides general utilities, such as generation of random sequences, automatic drawing of XY graphs, and interconversion between sequence formats.

Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays.

  • U. Alon, N. Barkai, A. Levine
  • Biology
    Proceedings of the National Academy of Sciences of the United States of America
  • 1999
A two-way clustering method is reported for analyzing a data set consisting of the expression patterns of different cell types, revealing broad coherent patterns that suggest a high degree of organization underlying gene expression in these tissues.