Martin Simonsen

MOTIVATION: Protein-protein interactions (PPIs) are pivotal for many biological processes, and similarity in Gene Ontology (GO) annotation has been found to be one of the strongest indicators for PPI. Most GO-driven algorithms for PPI inference combine machine learning and semantic similarity techniques. We introduce the concept of inducers as a method to...
MOTIVATION: Phylogenetic profiling methods can achieve good accuracy in predicting protein-protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly available, identifying the most informative RT is becoming...
The objective in molecular docking is to determine the best binding mode of two molecules in silico. A common application of molecular docking is in drug discovery, where a large number of ligands are docked against a protein to identify potential drug candidates. This is a computationally intensive problem, especially if flexibility of the molecules is...
The neighbour-joining method by Saitou and Nei is a widely used method for phylogenetic reconstruction, made popular by a combination of computational efficiency and reasonable accuracy. With the cubic running time of the formulation by Studier and Keppler, the method scales to hundreds of species, and while it is usually possible to infer phylogenies with thousands of...
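To make the cubic running time mentioned above concrete, here is a minimal sketch of the pair-selection step in the textbook neighbour-joining formulation. It is not the accelerated method this abstract goes on to describe. The function name `nj_pick_pair` and the example matrix are illustrative assumptions. Each iteration scans all pairs of the n remaining taxa, which costs O(n²), and n − 2 iterations are needed, giving O(n³) overall.

```python
from typing import List, Tuple


def nj_pick_pair(d: List[List[float]]) -> Tuple[int, int, float]:
    """Return (i, j, q) for the pair minimising the neighbour-joining criterion.

    Textbook criterion: Q(i, j) = (n - 2) * d[i][j] - r_i - r_j,
    where r_i is the sum of row i of the symmetric distance matrix d.
    `nj_pick_pair` is a hypothetical helper name used for illustration.
    """
    n = len(d)
    if n < 3:
        raise ValueError("need at least three taxa")
    row_sums = [sum(row) for row in d]
    best = (0, 1, float("inf"))
    # Scanning every pair is the O(n^2) step repeated in each iteration.
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
            if q < best[2]:
                best = (i, j, q)
    return best


if __name__ == "__main__":
    # Small illustrative distance matrix for five taxa (made-up values).
    example = [
        [0, 5, 9, 9, 8],
        [5, 0, 10, 10, 9],
        [9, 10, 0, 8, 7],
        [9, 10, 8, 0, 3],
        [8, 9, 7, 3, 0],
    ]
    print(nj_pick_pair(example))
```

A full implementation would then merge the selected pair into a new node, update the distance matrix, and repeat until three nodes remain.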
Sampling genomes with Fosmid vectors and sequencing of pooled Fosmid libraries on the Illumina platform for massively parallel sequencing is a novel and promising approach to optimizing the trade-off between sequencing costs and assembly quality. In order to sequence the genome of Norway spruce, which is of great size and complexity, we developed and applied...
Distance estimators are needed as input for popular distance-based phylogenetic reconstruction methods such as UPGMA and neighbour-joining. Computation of these takes O(n²l) time for n sequences of length l, which is usually fast compared to reconstructing a phylogenetic tree of n taxa. However, with the...
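The O(n²l) cost stated above can be seen in a minimal sketch of a pairwise distance computation: there are n(n − 1)/2 sequence pairs, and each comparison touches all l aligned sites. The sketch assumes pre-aligned sequences of equal length and uses the Jukes-Cantor correction as one common estimator. The abstract does not say which estimators the paper uses, and the function names are illustrative.

```python
import math
from typing import List


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned sites at which two sequences differ (O(l) work)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed mismatch fraction p."""
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(seqs: List[str]) -> List[List[float]]:
    """Symmetric n x n matrix of corrected distances: O(n^2) pairs, O(l) each."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = jukes_cantor(p_distance(seqs[i], seqs[j]))
    return d
```

The resulting matrix is the kind of input that UPGMA or neighbour-joining would consume.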