Potential-Based Reward Shaping Preserves Pareto Optimal Policies
(2017-05)
Reward shaping is a well-established family of techniques
that have been successfully used to improve the performance
and learning speed of Reinforcement Learning agents in single-objective
problems. Here we extend the ...
Policy Invariance under Reward Transformations for Multi-Objective Reinforcement Learning
(2017)
Reinforcement Learning (RL) is a powerful and well-studied Machine Learning
paradigm, where an agent learns to improve its performance in an environment
by maximising a reward signal. In multi-objective Reinforcement
Learning ...