• Publications
Knowledge Discovery in Multi-label Phenotype Data
TLDR
This work uses KDD to analyse data from mutant phenotype growth experiments with the yeast S. cerevisiae to predict novel gene functions, and learns rules which are accurate and biologically meaningful.
Statistical Evaluation of the Predictive Toxicology Challenge 2000-2001
TLDR
A statistical method was developed to test whether a model performs significantly better than random in ROC space; the best predictor was the Viniti model for female mice, with a p-value below 0.002.
Functional genomic hypothesis generation and experimentation by a robot scientist
TLDR
A physically implemented robotic system applies techniques from artificial intelligence to carry out cycles of scientific experimentation. Its intelligent experiment selection strategy is competitive with human performance and significantly outperforms both cheapest and random experiment selection, with cost decreases of 3-fold and 100-fold, respectively.
The Predictive Toxicology Challenge 2000-2001
TLDR
A challenge to predict the rodent carcinogenicity of new compounds based on the experimental results of the US National Toxicology Program (NTP), intended to stimulate the development of advanced SAR techniques for predictive toxicology models.
Hierarchical metabolomics demonstrates substantial compositional similarity between genetically modified and conventional potato crops.
TLDR
A comprehensive comparison of total metabolites in field-grown GM and conventional potato tubers using a hierarchical approach initiating with rapid metabolome "fingerprinting" to guide more detailed profiling of metabolites where significant differences are suspected.
The Automation of Science
TLDR
The development of Robot Scientist “Adam,” which has autonomously generated functional genomics hypotheses about the yeast Saccharomyces cerevisiae and experimentally tested these hypotheses by using laboratory automation, is reported.
An ontology of scientific experiments
TLDR
The proposed ontology EXPO links SUMO (the Suggested Upper Merged Ontology) with subject-specific ontologies of experiments by formalizing the generic concepts of experimental design, methodology and results representation.
Active Learning for Regression Based on Query by Committee
TLDR
A committee-based approach for active learning of real-valued functions is investigated; it is a variance-only strategy for selecting informative training data and is shown to suffer when the model class is misspecified, since the learner's bias is then high.
Finding Frequent Substructures in Chemical Compounds
TLDR
This paper applies data mining to the problem of predicting chemical carcinogenicity, and presents a knowledge discovery method for structured data, where patterns reflect the one-to-many and many-to-many relationships of several tables.