Mathematical programming methods for decentralized POMDPs

Raghav Aras

Thesis, Year: 2008

Des programmes mathématiques pour les processus décisionnels de Markoff décentralisés et partiellement observés

Raghav Aras

Role: Author
PersonId : 787951
IdRef : 131101420

Laboratoire Lorrain de Recherche en Informatique et ses Applications

Abstract

In this thesis, we study the problem of the optimal decentralized control of a partially observed Markov process over a finite horizon. The mathematical model corresponding to the problem is a decentralized POMDP (DEC-POMDP). Many practical problems from artificial intelligence and operations research can be modeled as DEC-POMDPs. However, solving a DEC-POMDP exactly is intractable (NEXP-hard). Exact algorithms are nonetheless needed to guide the development of approximate algorithms that can scale to practical-sized problems. Existing algorithms are mainly inspired by POMDP research (dynamic programming and forward search) and require an inordinate amount of time for even very small DEC-POMDPs. In this thesis, we develop a new approach, based on mathematical programming, for exactly solving a finite-horizon DEC-POMDP. The approach uses the sequence form of a control policy. Using the sequence form, we show how the problem can be formulated as a mathematical program with a nonlinear objective and linear constraints. We then show how this nonlinear program can be linearized to a 0-1 mixed integer linear program (MIP). We present two different 0-1 MIPs based on two different properties of a DEC-POMDP. Computational experience with these mathematical programs on four benchmark problems (MABC, MA-Tiger, Grid Meeting, Fire Fighting) shows that the time taken to find an optimal joint policy is one or two orders of magnitude less than that of existing exact algorithms. In the problems tested, the time taken drops from several hours to a few seconds or minutes.

We study the problem of the optimal decentralized control of a partially observed Markov process over a finite horizon. Mathematically, this problem is defined as a DEC-POMDP. Many problems from artificial intelligence and operations research can be formalized as DEC-POMDPs. Solving a DEC-POMDP exactly is a hard problem (NEXP-hard). Nevertheless, exact algorithms are important from the standpoint of the approximate algorithms used to solve practical problems. Existing algorithms are markedly inefficient even for very small DEC-POMDPs. In this thesis, we propose a new approach based on mathematical programming. Using the sequence form of a policy, we show that this problem can be formalized as a nonlinear program. Furthermore, we show how to transform this nonlinear program into linear programs with binary and continuous variables (0-1 MIPs). Computational experience on four standard DEC-POMDP problems shows that our approach finds an optimal policy much faster than existing approaches. The time drops from hours to seconds or minutes.
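
As a generic illustration of what turning a nonlinear term into 0-1 linear constraints can look like, the sketch below checks the textbook device that replaces a product of two binary variables, z = x * y, by three linear inequalities. The abstract does not state which linearization the thesis uses, so this is an assumption made for illustration only, and the function names `linearization_feasible`, `feasible_z_values`, and `check_all_cases` are hypothetical.

```python
from itertools import product


def linearization_feasible(x, y, z):
    """Check the three linear constraints that stand in for z == x * y.

    Assumption: x, y, z are each 0 or 1. This is the standard textbook
    linearization of a binary product, not necessarily the one in the thesis.
    """
    return z <= x and z <= y and z >= x + y - 1


def feasible_z_values(x, y):
    """Return every z in {0, 1} that satisfies the constraints for fixed x, y."""
    return [z for z in (0, 1) if linearization_feasible(x, y, z)]


def check_all_cases():
    """Confirm the constraints admit exactly z = x * y for all binary x, y."""
    return all(
        feasible_z_values(x, y) == [x * y]
        for x, y in product((0, 1), repeat=2)
    )
```

Calling `check_all_cases()` enumerates the four binary assignments of x and y and confirms that, in each case, the only feasible z equals the product, which is why the nonlinear term can be dropped once the inequalities are in place.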

Keywords

Planning under uncertainty; Multi-agent; Partial observability; Decentralization; Markov processes

Domains

Other [cs.OH]

Main file

SCD_T_2008_0092_ARAS.pdf (1.12 MB)

Origin: Files produced by the author(s)

https://hal.univ-lorraine.fr/tel-01748545

Submitted on: Thursday, March 29, 2018, 11:37:52

Last modified on: Monday, September 11, 2023, 17:41:19

Long-term archiving on: Friday, September 14, 2018, 07:37:32

Dates and versions

tel-01748545 , version 1 (29-03-2018)

Identifiers

HAL Id : tel-01748545 , version 1

Cite

Raghav Aras. Mathematical programming methods for decentralized POMDPs. Other [cs.OH]. Université Henri Poincaré - Nancy 1, 2008. English. ⟨NNT : 2008NAN10092⟩. ⟨tel-01748545⟩

Collections

CNRS INRIA UNIV-LORRAINE LORIA THESES-UL

68 views

127 downloads
