Corpus ID: 28807807

Thompson Sampling for Multi-Objective Multi-Armed Bandits Problem

@inproceedings{Yahyaa2015ThompsonSF,
  title={Thompson Sampling for Multi-Objective Multi-Armed Bandits Problem},
  author={Saba Q. Yahyaa and B. Manderick},
  booktitle={ESANN},
  year={2015}
}
The multi-objective multi-armed bandit (MOMAB) problem is a sequential decision process with stochastic rewards. Each arm generates a vector of rewards instead of a single scalar reward. Moreover, these multiple rewards might be conflicting. The MOMAB problem has a set of Pareto optimal arms, and an agent's goal is not only to find that set but also to play the arms in that set evenly or fairly. To find the Pareto optimal arms, linear scalarized functions or Pareto dominance relations can be…
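
The abstract points to Thompson sampling over the Pareto front of a MOMAB. What follows is a minimal sketch of that idea, not the paper's exact algorithm or experiments: it assumes Bernoulli reward vectors with independent Beta(1, 1) priors per arm and objective, samples one mean vector per arm, keeps the arms whose sampled vectors are Pareto optimal, and plays one of them uniformly at random so that the optimal arms are pulled evenly. The helper names (pareto_front, pareto_thompson_sampling) and the toy bi-objective bandit are illustrative, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)


def pareto_front(means):
    """Return indices of arms whose mean vectors are not Pareto-dominated.

    Arm j dominates arm i if it is >= on every objective and > on at least one.
    """
    n = len(means)
    front = []
    for i in range(n):
        dominated = any(
            np.all(means[j] >= means[i]) and np.any(means[j] > means[i])
            for j in range(n) if j != i
        )
        if not dominated:
            front.append(i)
    return front


def pareto_thompson_sampling(bandit, n_arms, n_objectives, horizon):
    """Pareto Thompson sampling with Bernoulli rewards and Beta(1, 1) priors.

    bandit(arm) is assumed to return a length-n_objectives vector of 0/1 rewards.
    """
    successes = np.ones((n_arms, n_objectives))  # Beta alpha parameters
    failures = np.ones((n_arms, n_objectives))   # Beta beta parameters
    pulls = np.zeros(n_arms, dtype=int)

    for _ in range(horizon):
        # One posterior sample per arm and objective.
        theta = rng.beta(successes, failures)
        # Restrict to arms whose sampled vectors are Pareto optimal and pick
        # uniformly among them, so the optimal arms are played fairly over time.
        arm = rng.choice(pareto_front(theta))
        reward = bandit(arm)
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


# Toy bi-objective bandit with conflicting objectives: arms 0 and 1 are
# Pareto optimal, arm 2 is dominated by both.
true_means = np.array([[0.8, 0.2], [0.2, 0.8], [0.3, 0.3]])
pulls = pareto_thompson_sampling(
    lambda a: (rng.random(2) < true_means[a]).astype(float),
    n_arms=3, n_objectives=2, horizon=2000,
)
print("pulls per arm:", pulls)

On the toy problem the two Pareto-optimal arms should each receive roughly half of the pulls, while the dominated arm is played only rarely. A linear-scalarization variant, also mentioned in the abstract, would instead draw a weight vector, scalarize the sampled mean vectors, and play the single arm maximizing that scalar.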
    Citations

    • Multi-objective Contextual Multi-armed Bandit With a Dominant Objective
    • Multi-Objective Generalized Linear Bandits
    • The biobjective multiarmed bandit: learning approximate lexicographic optimal allocations
    • Regulating Greed Over Time

    References

    Publications referenced by this paper.
    • Designing multi-objective multi-armed bandits algorithms: A study
    • Annealing-Pareto multi-objective multi-armed bandit algorithm
    • Knowledge Gradient for Multi-objective Multi-armed Bandit Algorithms
    • Performance assessment of multiobjective optimizers: an analysis and review
    • An Empirical Evaluation of Thompson Sampling
    • Adaptive Scalarization Methods in Multiobjective Optimization
    • On the likelihood that one unknown probability exceeds another in view of the evidence of two samples