Corpus ID: 226956159

Bayesian nonparametric modelling of sequential discoveries

Alessandro Zito, Tommaso Rigon, Otso Ovaskainen, David B. Dunson
arXiv: Methodology
We aim at modelling the appearance of distinct tags in a sequence of labelled objects. Common examples of this type of data include words in a corpus or distinct species in a sample. These sequential discoveries are often summarised via accumulation curves, which count the number of distinct entities observed in an increasingly large set of objects. We propose a novel Bayesian nonparametric method for species sampling modelling by directly specifying the probability of a new discovery…
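
As a rough illustration of the accumulation curve described in the abstract, the sketch below counts the distinct labels seen among the first i objects of a sequence, and also records a binary indicator of whether each object is a new discovery. The function names `accumulation_curve` and `discovery_indicators` and the toy word sequence are assumptions made for this example only; they are not taken from the paper, and the sketch does not implement the proposed Bayesian nonparametric model.

```python
from typing import Hashable, Iterable, List


def accumulation_curve(labels: Iterable[Hashable]) -> List[int]:
    """Return [K_1, ..., K_n], where K_i is the number of distinct
    labels among the first i objects of the sequence."""
    seen = set()
    curve = []
    for label in labels:
        seen.add(label)          # no effect if the label was already seen
        curve.append(len(seen))  # distinct labels observed so far
    return curve


def discovery_indicators(labels: Iterable[Hashable]) -> List[int]:
    """Return 1 at each position where the label appears for the
    first time (a new discovery) and 0 otherwise."""
    seen = set()
    indicators = []
    for label in labels:
        indicators.append(0 if label in seen else 1)
        seen.add(label)
    return indicators


# Hypothetical toy data: words in a tiny "corpus".
words = ["the", "cat", "sat", "on", "the", "mat"]
print(accumulation_curve(words))    # [1, 2, 3, 4, 4, 5]
print(discovery_indicators(words))  # [1, 1, 1, 1, 0, 1]
```

The running sum of the discovery indicators reproduces the accumulation curve, which is why a model for the probability of a new discovery at each step also determines the expected shape of the curve.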


Bayesian nonparametric inference for species variety with a two parameter Poisson-Dirichlet process prior
A Bayesian nonparametric methodology has been recently proposed in order to deal with the issue of prediction within species sampling problems. Such problems concern the evaluation, conditional on a…
A new estimator of the discovery probability.
A novel estimator is derived for the probability of detecting species that have been observed with any given frequency in the enlarged sample of size n+m. The result allows us to quantify both the rate at which rare species are detected and the achieved sample coverage of abundant species as m increases.
Bayesian nonparametric inference beyond the Gibbs-type framework
The definition and investigation of general classes of nonparametric priors has recently been an active research line in Bayesian statistics. Among the various proposals, the Gibbs-type family, which…
Are Gibbs-Type Priors the Most Natural Generalization of the Dirichlet Process?
The goal of this paper is to provide a systematic and unified treatment of Gibbs-type priors and highlight their implications for Bayesian nonparametric inference.
The Pitman–Yor multinomial process for mixture modelling
Discrete nonparametric priors play a central role in a variety of Bayesian procedures, most notably when used to model latent features as in clustering, mixtures and curve fitting. They are effective…
Nonparametric prediction in species sampling
A simple method is proposed for predicting the number of new species that would be discovered by additional sampling, under a continuous-time stochastic model in which species arrive in the sample according to independent Poisson processes with heterogeneous discovery rates.
Bayesian Nonparametric Estimation of the Probability of Discovering New Species
We consider the problem of evaluating the probability of discovering a certain number of new species in a new sample of population units, conditional on the number of species recorded in a basic…
Bayesian Inference for Logistic Models Using Pólya–Gamma Latent Variables
We propose a new data-augmentation strategy for fully Bayesian inference in models with binomial likelihoods. The approach appeals to a new class of Pólya–Gamma distributions, which are constructed…
The Dependent Dirichlet Process and Related Models
Standard regression approaches assume that some finite number of the response distribution characteristics, such as location and scale, change as a (parametric or nonparametric) function of…
Defining Predictive Probability Functions for Species Sampling Models.
This paper gives a new necessary and sufficient condition for arbitrary putative PPFs to define an EPPF, shows posterior inference for a large class of SSMs with a PPF that is not linear in cluster size, and discusses a numerical method to derive its PPF.