Corpus ID: 215416047

Solving the scalarization issues of Advantage-based Reinforcement Learning Algorithms

@article{Galatolo2020SolvingTS,
  title={Solving the scalarization issues of Advantage-based Reinforcement Learning Algorithms},
  author={Federico A. Galatolo and Mario G. C. A. Cimino and Gigliola Vaglini},
  journal={ArXiv},
  year={2020},
  volume={abs/2004.04120}
}
  • Federico A. Galatolo, Mario G. C. A. Cimino, Gigliola Vaglini
  • Published 2020
  • Mathematics, Computer Science
  • ArXiv
  • In this paper we investigate some of the issues that arise from the scalarization of the multi-objective optimization problem in the Advantage Actor Critic (A2C) reinforcement learning algorithm. We show how a naive scalarization leads to overlapping gradients, and we also argue that the entropy regularization term merely injects uncontrolled noise into the system. We propose two methods: one that avoids gradient overlapping (NOG) while keeping the same loss formulation; and one to avoid the noise…
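The scalarization the abstract refers to can be illustrated with the commonly used A2C loss, which collapses the policy, value, and entropy objectives into one weighted sum. The sketch below is a minimal single-step illustration and not the authors' code: the function names and the coefficients `value_coef=0.5` and `entropy_coef=0.01` are assumptions (typical defaults in public A2C implementations), and the proposed NOG method is not shown because the abstract does not describe it in detail.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of action logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def a2c_scalarized_loss(logits, action, value, ret,
                        value_coef=0.5, entropy_coef=0.01):
    """Naive scalarized A2C loss for a single transition.

    logits: unnormalized action preferences from the actor
    action: index of the action that was taken
    value:  critic estimate V(s)
    ret:    observed (bootstrapped) return for the state
    The coefficients are assumed defaults, not values from the paper.
    """
    probs = softmax(logits)
    advantage = ret - value                        # A(s, a) = R - V(s)
    policy_loss = -math.log(probs[action]) * advantage
    value_loss = advantage ** 2                    # squared critic error
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Three objectives collapsed into one scalar by a weighted sum.
    total = policy_loss + value_coef * value_loss - entropy_coef * entropy
    return total, policy_loss, value_loss, entropy
```

In an implementation where the actor and critic share parameters, differentiating `total` sends the gradients of all three terms into those shared parameters at once, which is one plausible reading of the "gradient overlapping" named in the abstract.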
