Corpus ID: 26033024

An Optimal Bayesian Network Based Solution Scheme for the Constrained Stochastic On-line Equi-Partitioning Problem

Authors: Sondre Glimsdal and Ole-Christoffer Granmo
A number of intriguing decision scenarios revolve around partitioning a collection of objects to optimize some application-specific objective function. This problem is generally referred to as the Object Partitioning Problem (OPP) and is known to be NP-hard. Here we consider a particularly challenging version of OPP, namely the Stochastic On-line Equi-Partitioning Problem (SO-EPP). In SO-EPP, the target partitioning is unknown and has to be inferred purely from observing an on-line sequence of…



Annealed MAP

The ANNEALEDMAP algorithm is proposed, a simulated annealing-based MAP algorithm that simulates a non-homogeneous Markov chain whose invariant function is a probability density that concentrates itself on the modes of the target density.

Accelerated Bayesian learning for decentralized two-armed bandit based decision making with applications to the Goore Game

A novel scheme for decentralized decision making based on the Goore Game is proposed, in which each decision maker is inherently Bayesian yet avoids computational intractability by simply updating the hyperparameters of sibling conjugate priors and sampling randomly from these posteriors.

Analysis of Thompson Sampling for the Multi-armed Bandit Problem

For the first time, it is shown that the Thompson Sampling algorithm achieves logarithmic expected regret for the stochastic multi-armed bandit problem.

Gaussian Process Based Optimistic Knapsack Sampling with Applications to Stochastic Resource Allocation

Gaussian Process based Optimistic Knapsack Sampling (GPOKS) is introduced, a novel model-based reinforcement learning scheme for solving stochastic fractional knapsack problems, founded on Gaussian Process (GP) enabled Optimistic Thompson Sampling (OTS).

A Game Theoretic Approach for Simultaneous Compaction and Equipartitioning of Spatial Data Sets

This paper proposes a novel approach for spatial data clustering based on the concepts of microeconomic theory, which can simultaneously optimize both the compaction and the equipartitioning objectives and yields a high value of fairness.

Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems

The focus is on two extreme cases in which the analysis of regret is particularly simple and elegant: independent and identically distributed payoffs and adversarial payoffs.

Thompson Sampling for Contextual Bandits with Linear Payoffs

A generalization of the Thompson Sampling algorithm is designed and analyzed for the stochastic contextual multi-armed bandit problem with linear payoff functions, where the contexts are provided by an adaptive adversary.

Solving two-armed Bernoulli bandit problems using a Bayesian learning automaton

Research is reported into a completely new family of solution schemes for the two-armed Bernoulli bandit (TABB) problem: the Bayesian learning automaton (BLA) family, which merely counts rewards and penalties and samples randomly from a pair of twin Beta distributions.

Further Optimal Regret Bounds for Thompson Sampling

A novel regret analysis for Thompson Sampling is provided that proves the first near-optimal problem-independent bound of O(√(NT ln T)) on the expected regret of this algorithm, and simultaneously provides the optimal problem-dependent bound.
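
Several of the entries above rest on the same mechanism: keep reward and penalty counts per arm, draw one sample from each arm's Beta posterior, and play the arm with the larger sample. The following is a minimal sketch of that idea for the two-armed Bernoulli case, not the authors' implementation. The function name `beta_sampling_two_armed`, the uniform Beta(1, 1) prior, and the reward probabilities in the usage line are assumptions chosen for illustration.

```python
import random


def beta_sampling_two_armed(p_arms, rounds, seed=0):
    """Play a two-armed Bernoulli bandit by sampling from twin Beta posteriors.

    p_arms: true reward probabilities of the two arms (hypothetical test values).
    Returns (pulls, counts), where counts[i] = [alpha_i, beta_i].
    """
    rng = random.Random(seed)
    # Assumption: a uniform Beta(1, 1) prior, so each count starts at 1.
    counts = [[1, 1], [1, 1]]
    pulls = [0, 0]
    for _ in range(rounds):
        # Draw one sample from each arm's Beta posterior.
        samples = [rng.betavariate(a, b) for a, b in counts]
        # Play the arm whose sample is larger.
        arm = 0 if samples[0] > samples[1] else 1
        # Simulate the Bernoulli feedback for the chosen arm.
        rewarded = rng.random() < p_arms[arm]
        # Update only by counting: a reward increments alpha, a penalty increments beta.
        counts[arm][0 if rewarded else 1] += 1
        pulls[arm] += 1
    return pulls, counts


pulls, counts = beta_sampling_two_armed([0.9, 0.1], rounds=2000, seed=1)
print(pulls, counts)
```

With well-separated arms such as these, the pull counts should concentrate on the better arm as the posteriors sharpen, which is the behavior the regret analyses listed above quantify.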