Stefan J. Witwicki and Edmund H. Durfee. In Proceedings of the 4th Workshop on Multi-agent Sequential Decision-Making in Uncertain Domains (MSDM-2009) at AAMAS, pages 65-72. Budapest, Hungary. May 2009.
Policy optimization in Decentralized MDPs is, in general, intractable, so researchers have developed a variety of techniques geared towards solving restricted subclasses of these problems more efficiently. In particular, Becker and colleagues have identified an interesting class wherein agents' interactions consist of event-driven dependencies, and have applied informed policy search techniques to solve these problems optimally. Here we present a dual formulation of the Event-driven DEC-MDP, representing interactions with a well-established commitment paradigm. We claim that, for this particular class of problems, searching a space of rich commitments is equivalent to searching the policy space directly. We further argue that our commitment-based reformulation not only enables more efficient, scalable computation of approximate solutions, but also provides a natural flexibility of approximation, by which interactions can be modeled in more or less detail.