Learn More
A conventional neural network approach to regression problems approximates the conditional mean of the output vector. For mappings which are multi-valued this approach breaks down, since the average of two solutions is not necessarily a valid solution. In this article Mixture Density Networks, a principled method for modelling conditional probability(More)
We introduce a flexible visual data mining framework which combines advanced projection algorithms from the machine learning domain and visual techniques developed in the information visualization domain. The advantage of such an interface is that the user is directly involved in the data mining process. We integrate principled projection algorithms, such(More)
In many problems in spatial statistics it is necessary to infer a global problem solution by combining local models. A principled approach to this problem is to develop a global probabilistic model for the relationships between local variables and to use this as the prior in a Bayesian inference procedure. We use a Gaussian process with hyper-parameters(More)
This ongoing research, funded by the NHS through its New and Emerging Applications of Technology, aims to improve mental-health risk assessments by developing a universally accessible computeris ed decision support system. The system will contain a risk-screening tool that records client data (cues) and provides risk estimates for suicide, self-harm,(More)
Predicting the log of the partition coefficient P is a long-standing benchmark problem in Quantitative Structure-Activity Relationships (QSAR). In this paper we show that a relatively simple molecular representation (using 14 variables) can be combined with leading edge machine learning algorithms to predict logP on new compounds more accurately than(More)
Multidimensional compound optimization is a new paradigm in the drug discovery process, yielding efficiencies during early stages and reducing attrition in the later stages of drug development. The success of this strategy relies heavily on understanding this multidimensional data and extracting useful information from it. This paper demonstrates how(More)
A visualization plot of a data set of molecular data is a useful tool for gaining insight into a set of molecules. In chemoinformatics, most visualization plots are of molecular descriptors, and the statistical model most often used to produce a visualization is principal component analysis (PCA). This paper takes PCA, together with four other statistical(More)