2024 | OriginalPaper | Book chapter

A Meta-MDP Approach for Information Gathering Heterogeneous Multi-agent Systems

Written by: Alvin Gandois, Abdel-Illah Mouaddib, Simon Le Gloannec, Ayman Alfalou

Published in: Robotics, Computer Vision and Intelligent Systems

Publisher: Springer Nature Switzerland

Abstract

In this paper, we address the problem of heterogeneous multi-robot cooperation for information gathering and situation evaluation in a stochastic and partially observable environment. The goal is to gather information optimally about targets in the environment using several robots with different capabilities. The classical Dec-POMDP framework can compute an optimal joint policy for such problems, but it scales poorly. To overcome this limitation, we developed a Meta-MDP model whose actions are individual POMDP-based information-gathering policies. We compute an optimal exploration policy for each robot-target pair, and the Meta-MDP model acts as a long-term optimal task allocation algorithm. We evaluate our model in a simulation environment, compare it with an optimal MPOMDP approach, and show promising results in solution quality and scalability.
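The two-level idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: the state encoding (one discrete information level per target), the success probabilities in `SUCCESS`, the reward (total information gained), and the discount `GAMMA` are all assumptions made for illustration. Each meta-action assigns every robot to one target, and the per-pair exploration policy is abstracted to a single probability of raising that target's information level by one. Plain value iteration then solves the resulting meta-level MDP.

```python
from itertools import product

# Toy meta-level MDP. Every number and name here is an illustrative
# assumption, not a value or identifier taken from the paper.
N_TARGETS = 2
MAX_LEVEL = 2          # information level at which a target is fully known
GAMMA = 0.95           # discount factor
# SUCCESS[r][t]: chance that robot r, running its policy on target t,
# raises that target's information level by one in a meta-step.
SUCCESS = [[0.9, 0.2],
           [0.3, 0.8]]

STATES = list(product(range(MAX_LEVEL + 1), repeat=N_TARGETS))
# A meta-action assigns each robot to one target index.
ACTIONS = list(product(range(N_TARGETS), repeat=len(SUCCESS)))


def transitions(state, action):
    """Yield (probability, next_state, reward) for one meta-step."""
    for outcome in product([0, 1], repeat=len(action)):
        prob = 1.0
        levels = list(state)
        for robot, (target, ok) in enumerate(zip(action, outcome)):
            p = SUCCESS[robot][target]
            prob *= p if ok else 1.0 - p
            if ok:
                levels[target] = min(MAX_LEVEL, levels[target] + 1)
        # Reward is the total information gained during this step.
        yield prob, tuple(levels), sum(levels) - sum(state)


def q_value(state, action, values):
    """Expected discounted return of taking `action` in `state`."""
    return sum(p * (r + GAMMA * values[s2])
               for p, s2, r in transitions(state, action))


def solve(tol=1e-9):
    """Value iteration; returns (values, greedy policy) as dicts."""
    values = {s: 0.0 for s in STATES}
    while True:
        new = {s: max(q_value(s, a, values) for a in ACTIONS)
               for s in STATES}
        converged = max(abs(new[s] - values[s]) for s in STATES) < tol
        values = new
        if converged:
            break
    policy = {s: max(ACTIONS, key=lambda a: q_value(s, a, values))
              for s in STATES}
    return values, policy
```

Calling `values, policy = solve()` and reading `policy[(0, 0)]` gives the assignment chosen when nothing is known yet; with the assumed probabilities, each robot is sent to the target it is better at. The meta-state space here grows with the number of targets but the joint observation histories of a Dec-POMDP never appear, which is the kind of saving the abstract alludes to.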

Metadata
Title
A Meta-MDP Approach for Information Gathering Heterogeneous Multi-agent Systems
Written by
Alvin Gandois
Abdel-Illah Mouaddib
Simon Le Gloannec
Ayman Alfalou
Copyright year
2024
DOI
https://doi.org/10.1007/978-3-031-59057-3_22
