The Prison Guard Training Game

A Case Study of Temporal Difference Learning Provoking Cooperative Behavior in Multi-Agent Systems