Corpus ID: 28807807

Thompson Sampling for Multi-Objective Multi-Armed Bandits Problem

@inproceedings{Yahyaa2015ThompsonSF,
  title={Thompson Sampling for Multi-Objective Multi-Armed Bandits Problem},
  author={Saba Q. Yahyaa and B. Manderick},
  booktitle={ESANN},
  year={2015}
}
  • Saba Q. Yahyaa, B. Manderick
  • Published in ESANN 2015
  • Computer Science
  • The multi-objective multi-armed bandit (MOMAB) problem is a sequential decision process with stochastic rewards. Each arm generates a vector of rewards instead of a single scalar reward. Moreover, these multiple rewards might be conflicting. The MOMAB problem has a set of Pareto optimal arms, and an agent's goal is not only to find that set but also to play the arms in that set evenly, or fairly. To find the Pareto optimal arms, a linear scalarized function or Pareto dominance relations can be…
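
The concepts named in the abstract (reward vectors, Pareto dominance, linear scalarization, and fair play over the Pareto optimal set) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract is truncated and does not specify it. The sketch assumes Bernoulli rewards per objective with independent Beta(1, 1) priors, and picks uniformly at random among the arms that are Pareto optimal under the sampled means. All function names and the example means are hypothetical.

```python
import random


def dominates(u, v):
    """True if u Pareto-dominates v: no worse in every objective, strictly better in one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_front(vectors):
    """Indices of the vectors that no other vector dominates."""
    return [
        i for i, u in enumerate(vectors)
        if not any(dominates(v, u) for j, v in enumerate(vectors) if j != i)
    ]


def linear_scalarize(vector, weights):
    """Collapse a reward vector to a scalar with fixed weights (one weight per objective)."""
    return sum(w * x for w, x in zip(weights, vector))


def pareto_thompson_step(successes, failures, rng):
    """Sample a mean per arm and objective from its Beta posterior (assumed Beta(1, 1) prior),
    then choose uniformly among the arms on the Pareto front of the sampled means."""
    samples = [
        [rng.betavariate(s + 1, f + 1) for s, f in zip(arm_s, arm_f)]
        for arm_s, arm_f in zip(successes, failures)
    ]
    return rng.choice(pareto_front(samples))


def simulate(means, rounds, seed=0):
    """Run the sketch on hypothetical Bernoulli arms and return the pull count per arm."""
    rng = random.Random(seed)
    n_arms, n_obj = len(means), len(means[0])
    successes = [[0] * n_obj for _ in range(n_arms)]
    failures = [[0] * n_obj for _ in range(n_arms)]
    pulls = [0] * n_arms
    for _ in range(rounds):
        arm = pareto_thompson_step(successes, failures, rng)
        pulls[arm] += 1
        # Each objective of the chosen arm pays an independent Bernoulli reward.
        for d, p in enumerate(means[arm]):
            if rng.random() < p:
                successes[arm][d] += 1
            else:
                failures[arm][d] += 1
    return pulls


# Hypothetical true means for three arms and two conflicting objectives.
EXAMPLE_MEANS = [(0.8, 0.2), (0.2, 0.8), (0.1, 0.1)]
```

With these example means, `pareto_front(EXAMPLE_MEANS)` contains the first two arms, since neither dominates the other and both dominate the third. Choosing uniformly within the sampled front is one simple way to approximate the fairness goal the abstract describes; a linear scalarization with fixed weights would instead steer play toward whichever optimal arm those weights favor.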
