MOTIVATION
As an increasing number of genome-wide association studies reveal the limitations of the attempt to explain phenotypic heritability by single genetic loci, there is a recent focus on associating complex phenotypes with sets of genetic loci. Although several methods for multi-locus mapping have been proposed, it is often unclear how to relate the…

In this paper we integrate two essential processes, discretization of continuous data and learning of a model that explains them, towards fully computational machine learning from continuous data. Discretization is fundamental for machine learning and data mining, since every continuous datum, e.g., a real-valued datum obtained by observation in the real…

We present a method for finding all subgraphs whose occurrence is significantly enriched in a particular class of graphs while correcting for multiple testing. Although detecting such significant subgraphs is a crucial step for further analysis across application domains, multiple testing of subgraphs has not been investigated before as it is not only…

Random walk kernels measure graph similarity by counting matching walks in two graphs. In their most popular form, the geometric random walk kernel, longer walks of length k are downweighted by a factor of λ^k (λ < 1) to ensure convergence of the corresponding geometric series. We know from the field of link prediction that this downweighting often leads to…
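
The walk-counting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes unlabeled graphs given as 0/1 adjacency matrices, counts matching walks through the direct product graph, and truncates the geometric series at a hypothetical `max_len` rather than summing it in closed form. The function names `kronecker` and `geometric_rw_kernel` are made up for this illustration.

```python
def kronecker(a, b):
    """Adjacency matrix of the direct product graph of graphs a and b.

    A walk in the product graph corresponds to a pair of matching walks,
    one in each input graph.
    """
    n, m = len(a), len(b)
    return [[a[i][k] * b[j][l] for k in range(n) for l in range(m)]
            for i in range(n) for j in range(m)]


def geometric_rw_kernel(a, b, lam, max_len=50):
    """Truncated geometric random walk kernel (illustrative sketch).

    Computes sum over k = 0..max_len of lam**k * (number of matching
    walks of length k), where walks are counted in the product graph.
    """
    w = kronecker(a, b)
    size = len(w)
    # vec[i] = number of walks of the current length starting at product node i
    vec = [1.0] * size
    total = 0.0
    weight = 1.0  # lam**k for the current walk length k
    for _ in range(max_len + 1):
        total += weight * sum(vec)
        vec = [sum(w[i][j] * vec[j] for j in range(size)) for i in range(size)]
        weight *= lam
    return total


if __name__ == "__main__":
    edge = [[0, 1], [1, 0]]  # a single undirected edge
    print(geometric_rw_kernel(edge, edge, lam=0.5))
```

With a small λ the terms for long walks shrink quickly, so the value is dominated by short walks. For the untruncated series, convergence in general requires λ to be smaller than the reciprocal of the largest eigenvalue of the product graph's adjacency matrix.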

Distance-based approaches to outlier detection are popular in data mining, as they do not require modeling the underlying probability distribution, which is particularly challenging for high-dimensional data. We present an empirical comparison of various approaches to distance-based outlier detection across a large number of datasets. We report the…
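
As a rough illustration of what a distance-based outlier score looks like, the sketch below uses the distance to the k-th nearest neighbor as the score. This is one common variant chosen for illustration, not necessarily one of the approaches compared in the paper, and the function name `knn_outlier_scores` is hypothetical.

```python
import math


def knn_outlier_scores(points, k=1):
    """Score each point by the Euclidean distance to its k-th nearest neighbor.

    Larger scores indicate points that lie far from the rest of the data.
    No probability distribution is fitted; only pairwise distances are used.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, in ascending order
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10)]
    scores = knn_outlier_scores(data, k=1)
    # Index of the point with the highest outlier score
    print(max(range(len(data)), key=scores.__getitem__))
```

This brute-force version compares every pair of points, so its cost grows quadratically with the number of points.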

- Felipe Llinares-López, Dominik G. Grimm, Dean A. Bodenham, Udo Gieraths, Mahito Sugiyama, Beth Rowan +1 other
- Bioinformatics
- 2015

MOTIVATION
Genetic heterogeneity, the fact that several sequence variants give rise to the same phenotype, is a phenomenon that is of the utmost interest in the analysis of complex phenotypes. Current approaches for finding regions in the genome that exhibit genetic heterogeneity suffer from at least one of two shortcomings: (i) they require the definition…

We present learning of figures, nonempty compact sets in Euclidean space, based on Gold's learning model, aiming at a computable foundation for binary classification of multivariate data. Encoding real vectors with no numerical error requires infinite sequences, resulting in a gap between each real vector and its discretized representation used for the…

Goal: given multiple networks, find features (vertices) that are associated with the target response and tend to be connected to each other.

We build information geometry for a partially ordered set of variables and define orthogonal decomposition of information theoretic quantities. The natural connection between information geometry and order theory leads to efficient decomposition algorithms. It is a generalization of Amari's seminal work on hierarchical decomposition of probability…

- Mahito Sugiyama, Niklas Kasenburg, Karsten Borgwardt, ETH Zürich
- 2015