• Corpus ID: 13205570

Retrieval of Experiments with Sequential Dirichlet Process Mixtures in Model Space

@article{Dutta2013RetrievalOE,
  title={Retrieval of Experiments with Sequential Dirichlet Process Mixtures in Model Space},
  author={Ritabrata Dutta and Sohan Seth and Samuel Kaski},
  journal={ArXiv},
  year={2013},
  volume={abs/1310.2125}
}
We address the problem of retrieving relevant experiments given a query experiment, motivated by the public databases of datasets in molecular biology and other experimental sciences, and the need of scientists to relate to earlier work on the level of actual measurement data. Since experiments are inherently noisy and databases ever accumulating, we argue that a retrieval engine should possess two particular characteristics. First, it should compare models learnt from the experiments rather… 

Retrieval of Experiments by Efficient Comparison of Marginal Likelihoods
TLDR
This work argues that a retrieval metric is a sensible measure of similarity between two experiments since it permits inclusion of experiment-specific prior knowledge and demonstrates the efficacy of this approach on simulated data with simple linear regression as the models, and real world datasets.
Retrieval of Experiments by Efficient Estimation of Marginal Likelihood
TLDR
This work argues that a retrieval metric is a sensible measure of similarity between two experiments since it permits inclusion of experiment-specific prior knowledge and demonstrates the efficacy of this approach on simulated data with simple linear regression as the models, and real world datasets.

References

SHOWING 1-10 OF 23 REFERENCES
Variational inference for Dirichlet process mixtures
TLDR
A variational inference algorithm for DP mixtures is presented, along with experiments that compare the algorithm to Gibbs sampling algorithms for DP mixtures of Gaussians and an application to a large-scale image analysis problem.
Multi-Task Learning for Classification with Dirichlet Process Priors
TLDR
Experimental results on two real-life MTL problems indicate that the proposed algorithms automatically identify subgroups of related tasks whose training data appear to be drawn from similar distributions, and that they are more accurate than simpler approaches such as single-task learning, pooling of data across all tasks, and simplified approximations to DP.
Mixtures of Dirichlet Processes with Applications to Bayesian Nonparametric Problems
process. This paper extends Ferguson's result to cases where the random measure is a mixing distribution for a parameter which determines the distribution from which observations are made. The
Fast Bayesian Inference in Dirichlet Process Mixture Models
  • Lianming Wang, D. Dunson
  • Computer Science
    Journal of computational and graphical statistics : a joint publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America
  • 2011
TLDR
This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models, viewing the partitioning of subjects into clusters as a model selection problem, and proposes a sequential greedy search algorithm for selecting the partition.
Sequential Monte Carlo Samplers for Dirichlet Process Mixtures
TLDR
The proposed algorithm is a particular SMC sampler that enables us to design sophisticated clustering update schemes, such as updating past trajectories of the particles in light of recent observations, and still ensures convergence to the true DPM target distribution asymptotically.
Particle filters for mixture models with an unknown number of components
TLDR
The performance of this particle filter, when analyzing both simulated and real data from a Gaussian mixture model, is uniformly better than the particle filter algorithm of Chen and Liu, and in many situations it outperforms a Gibbs Sampler.
Particle learning for general mixtures
TLDR
This paper develops particle learning (PL) methods for the estimation of general mixture models and shows that PL leads to straightforward tools for marginal likelihood calculation and posterior cluster allocation.
A Split-Merge Markov chain Monte Carlo Procedure for the Dirichlet Process Mixture Model
TLDR
A split-merge Markov chain algorithm is proposed to address the problem of inefficient sampling for conjugate Dirichlet process mixture models by employing a new technique in which an appropriate proposal for splitting or merging components is obtained by using a restricted Gibbs sampling scan.
A Probabilistic Model for Online Document Clustering with Application to Novelty Detection
TLDR
A probabilistic model for online document clustering is proposed, using a non-parametric Dirichlet process prior to model the growing number of clusters and a general English language model as the base distribution to handle the generation of novel clusters.
A Bayesian Analysis of Some Nonparametric Problems
Bayesian approach remained rather unsuccessful in treating nonparametric problems. This is primarily due to the difficulty in finding workable prior distribution on the parameter space, which in