Pedro Sequeira, Francisco S. Melo and Ana Paiva
GAIPS technical report series, GAIPS-TR-001-12, March 2012
Reinforcement learning agents have inherent limitations and pose design challenges that, under certain circumstances, may impact their autonomy and flexibility. A recent framework for intrinsically-motivated reinforcement learning proposes the existence of intrinsic reward functions that, if used by the agent during learning, have the potential to improve its performance when evaluated according to its designer's goals. Such functions map features of the agent's history of interaction with its environment into scalar reward values. In this paper, we propose a set of reward features based on four common dimensions of emotional appraisal that, similarly to what occurs in biological agents, evaluate the significance of several aspects of the agent's history of interaction with its environment. Our experiments in several foraging scenarios show that, by optimizing the relative contributions of each feature for a set of environments of interest, emotion-based reward functions enable better performance than more standard goal-oriented reward functions, particularly in the presence of agent limitations. The results support our claim that reward functions inspired by biological evolutionary adaptive mechanisms (as emotions are) have the potential to provide more autonomy to learning agents and greater flexibility in reward design, while alleviating some limitations inherent to artificial agents.
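The emotion-based reward functions described above combine appraisal-dimension features into a scalar intrinsic reward, with the relative contribution of each feature optimized per environment. The following is a minimal sketch of that idea; the feature names and weight values are illustrative assumptions, not taken from the report.

```python
# Sketch of an intrinsic reward as a weighted combination of
# appraisal-based features of the agent's interaction history.
# Feature names and weights below are hypothetical examples.

def emotion_based_reward(features, weights):
    """Map appraisal feature values to a scalar intrinsic reward."""
    return sum(weights[name] * value for name, value in features.items())

# Four illustrative appraisal dimensions evaluated from the
# agent's history (values assumed for the example):
features = {"novelty": 0.2, "valence": 1.0,
            "goal_relevance": 0.5, "control": -0.1}

# Relative contributions, e.g. found by optimizing over a set
# of environments of interest:
weights = {"novelty": 0.1, "valence": 0.6,
           "goal_relevance": 0.25, "control": 0.05}

reward = emotion_based_reward(features, weights)
```

In this framing, a standard goal-oriented reward is the special case where all weight is placed on a single goal-related feature.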