Ph.D. Thesis, Instituto Superior Técnico, Universidade de Lisboa, Portugal, 2013
Reinforcement learning (RL) is a computational approach that models autonomous agents facing a sequential decision problem in a dynamic environment. The behavior of the agent is guided by a reward mechanism embedded into the agent by its designer. Designing flexible reward mechanisms, capable of guiding the agent in learning the task intended by its designer, is a very demanding endeavor: on one hand, artificial agents have inherent limitations that often impact their ability to actually solve the task they were initially designed to accomplish; on the other hand, traditional approaches to RL are too restrictive given the agents' limitations, potentially leading to poor performance. Therefore, applying RL to complex problems often requires a great amount of manual fine-tuning of the agents so that they perform well in a given scenario, and even more so when we want them to operate in a variety of different situations, often involving complex interactions with other agents.
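As an illustration of how a designer-supplied reward signal shapes what an agent learns, consider a minimal tabular Q-learning update (a standard RL algorithm, shown here only as a sketch; the function and variable names are our own and are not taken from the thesis):

```python
def q_learning_step(Q, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.9):
    """One tabular Q-learning update.

    The `reward` argument is the designer-embedded signal: whatever
    it rewards is what the agent's value estimates (and hence its
    behavior) will come to reflect.
    """
    # Value of the best action available in the next state.
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    # Move the estimate toward the reward plus discounted future value.
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return Q
```

A single update starting from an empty table moves the estimate for the visited state-action pair a fraction `alpha` of the way toward the observed reward, which is exactly the sense in which the reward mechanism "guides" learning.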
In this thesis we adopt a recent framework for intrinsically-motivated reinforcement learning (IMRL) that proposes the use of richer reward signals related to aspects of the agent's relationship with its environment that may not be directly related to the task intended by its designer. We propose to take inspiration from information processing mechanisms present in natural organisms to build more flexible and robust reward mechanisms for autonomous RL agents. Specifically, we focus on the role of emotions as an evolutionary adaptive mechanism, and also on the way individuals interact and cooperate with each other as a social group.
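The core idea of IMRL can be sketched as a reward signal composed of the designer's task (extrinsic) reward plus an intrinsic term tied to the agent's relationship with its environment. The sketch below uses a novelty bonus as one common choice of intrinsic signal; the specific bonus, the weight `beta`, and all names here are illustrative assumptions, not the particular reward mechanisms developed in the thesis:

```python
def intrinsic_reward(visit_counts, state):
    # Hypothetical novelty bonus: large for rarely visited states,
    # decaying as the state is visited more often.
    return 1.0 / (1.0 + visit_counts.get(state, 0))

def total_reward(extrinsic, visit_counts, state, beta=0.5):
    # IMRL-style composite signal: task reward plus a weighted
    # intrinsic term that does not depend on the designer's task.
    return extrinsic + beta * intrinsic_reward(visit_counts, state)
```

With this composition, an agent in an unvisited state receives a boosted reward, while a frequently visited state contributes almost only the task reward, so the intrinsic term encourages exploration without replacing the task objective.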
In a series of experiments, we show that the adaptation of emotion-based signals for the design of rewards within IMRL allows us to achieve general-purpose solutions and, at the same time, alleviate some of the agent's inherent limitations. We also show that social groups of IMRL agents, endowed with a reward mechanism inspired by the way humans and other animals exchange signals with one another, end up maximizing their collective fitness by promoting socially-aware behaviors. Furthermore, by evolving reward signals whose dynamic and structural properties relate to emotions and the way they evolved in nature, we show that emotion-based design might have a greater impact on the adaptation of artificial agents than previously thought.
Overall, our results support the claim that, by endowing agents with reward mechanisms inspired by the way emotions and social mechanisms evaluate and structure natural organisms' interactions with their environment, we provide agent designers with a flexible and robust reward design principle that is able to overcome common limitations inherent to RL agents.