Gilles Caraux

PermutMatrix is a work space designed to graphically explore gene expression data. It relies on the graphical approach introduced by Eisen and also offers several methods for the optimal reorganization of rows and columns of a numerical dataset. For example, several methods are proposed for optimal reorganization of the leaves of a hierarchical clustering…
High-throughput sequencing technologies offer new perspectives for biomedical, agronomical and evolutionary research. Promising advances now concern the application of these technologies to large-scale studies of genetic variation. Such studies require the genotyping of large numbers of samples. This is theoretically possible using 454 pyrosequencing,…
In the nonparametric discrimination problem, one observes $X$, a random vector with values in $\mathbb{R}^d$, and wishes to estimate $\theta$, a random variable known to take values in $\{1, \ldots, M\}$. All that is known about the distribution of $(X, \theta)$ is that which can be inferred from a sample $(X_1, \theta_1), \ldots, (X_n, \theta_n)$ drawn from the distribution of $(X, \theta)$. The sample, denoted by $D_n$,…
Two-dimensional gel electrophoresis, a routine application in proteomics, separates proteins according to their molecular mass (M(r)) and isoelectric point (pI). As the genomic sequences for more and more organisms are determined, the M(r) and pI of all their proteins can be estimated computationally. The examination of several of these theoretical proteome…
Gascuel, O. and G. Caraux, Distribution-free performance bounds with the resubstitution error estimate, Pattern Recognition Letters 13 (1992) 757-764. Two distribution-free upper bounds are given for the true error rate of a classifier, using the resubstitution error estimate. These bounds apply when the classifier is selected from a finite decision rule…
We consider the problem of phylogenetic reconstruction, which consists in estimating the evolutionary history of a set of species. This unknown history is modelled as a tree and estimated from nucleotide sequences taken from the species’ genome. The first goal of the estimation is to produce a tree which is structurally as close as possible to the true tree.…
Inductive learning systems search for regularities that describe environmental observations. These systems often use numerical heuristics to guide this search. They also select regularities which are… […] therefore be applied with some assurance to an example which does not belong to the learning set. In other words, statistical significance may be used to assess…
Two methods are commonly employed for evaluating the extent of the uncertainty of evolutionary distances between sequences: either some estimator of the variance of the distance estimator, or the bootstrap method. However, both approaches can be misleading, particularly when the evolutionary distance is small. We propose using another statistical method…
We have developed a specialised proteomic database for the analysis of matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) data derived from tryptic peptides of Sinorhizobium meliloti proteins. This database currently contains the amino acid sequence data of the proteins predicted from the complete chromosome,…