PcTVI: Parallel MDP Solver Using a Decomposition into Independent Chains

Gareau, Jaël Champagne; Beaudry, Éric; Makarenkov, Vladimir

doi:10.1007/978-3-031-09034-9_12

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Included in the following conference series:

Conference of the International Federation of Classification Societies

Abstract

Markov Decision Processes (MDPs) are useful for solving real-world probabilistic planning problems. However, finding an optimal solution to an MDP can take an unreasonable amount of time when the number of states is large. In this paper, we present a way to decompose an MDP into Strongly Connected Components (SCCs) and to find dependency chains for these SCCs. We then propose a variant of the Topological Value Iteration (TVI) algorithm, called parallel chained TVI (pcTVI), which solves independent chains of SCCs in parallel by leveraging modern multicore computer architectures. We measured the performance of our algorithm by comparing it to the baseline TVI algorithm on a new probabilistic planning domain introduced in this study. On a computer with 32 cores, pcTVI achieved a speedup factor of 20 over traditional TVI.
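
To make the decomposition concrete, here is a minimal Python sketch of the baseline idea the abstract builds on: split the MDP's transition graph into SCCs, then run value iteration on one SCC at a time, starting from the SCCs closest to the goal. It is a sequential illustration only and is not the authors' pcTVI implementation; the chain-finding step and the parallel scheduling are not reproduced here. The dictionary encoding of the MDP, the cost-minimizing formulation with zero-valued goal states, and all function names (`successors`, `tarjan_sccs`, `solve_scc`, `topological_value_iteration`) are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: mdp[state][action] = (cost, [(prob, next_state), ...]).
# A state with no actions is treated as a goal state whose value stays 0.
MDP = Dict[str, Dict[str, Tuple[float, List[Tuple[float, str]]]]]


def successors(mdp: MDP, s: str) -> List[str]:
    """All states reachable from s in one step under any action."""
    return [t for _, outcomes in mdp[s].values() for _, t in outcomes]


def tarjan_sccs(mdp: MDP) -> List[List[str]]:
    """Tarjan's algorithm. SCCs come out in reverse topological order:
    an SCC is emitted only after every SCC it can reach."""
    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack = set()
    sccs: List[List[str]] = []

    def visit(v: str) -> None:
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in successors(mdp, v):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of an SCC
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            sccs.append(component)

    for s in mdp:
        if s not in index:
            visit(s)
    return sccs


def solve_scc(mdp: MDP, scc: List[str], V: Dict[str, float], eps: float = 1e-9) -> None:
    """Bellman backups restricted to one SCC until the residual drops below eps."""
    while True:
        residual = 0.0
        for s in scc:
            if not mdp[s]:
                continue  # goal state: nothing to back up
            best = min(cost + sum(p * V[t] for p, t in outcomes)
                       for cost, outcomes in mdp[s].values())
            residual = max(residual, abs(best - V[s]))
            V[s] = best
        if residual < eps:
            return


def topological_value_iteration(mdp: MDP) -> Dict[str, float]:
    """Solve SCCs one by one; values of downstream SCCs are final when reused."""
    V = {s: 0.0 for s in mdp}
    for scc in tarjan_sccs(mdp):
        solve_scc(mdp, scc, V)
    return V


# Tiny example: "a" reaches the goal "g" with probability 0.5 per attempt,
# and "s" chooses between going through "a" or paying 4 to reach "g" directly.
example_mdp: MDP = {
    "s": {"left": (1.0, [(1.0, "a")]), "right": (4.0, [(1.0, "g")])},
    "a": {"go": (1.0, [(0.5, "g"), (0.5, "a")])},
    "g": {},
}
values = topological_value_iteration(example_mdp)
print(values)
```

In this example every state is its own SCC, and the value of `a` satisfies V(a) = 1 + 0.5 V(a), so it converges toward 2 and `s` prefers the `left` action. Because an SCC only depends on the SCCs it can reach, groups of SCCs with no dependency between them could in principle be handed to separate workers, which is the kind of independence the abstract says pcTVI exploits.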

Author information

Authors and Affiliations

Université du Québec à Montréal, Montreal, Canada
Jaël Champagne Gareau, Éric Beaudry & Vladimir Makarenkov

Corresponding author

Correspondence to Jaël Champagne Gareau.

Editor information

Editors and Affiliations

Faculty of Economics, University of Porto, Porto, Portugal
Paula Brito
Business Research Unit, University Institute of Lisbon, Lisbon, Portugal
José G. Dias
Department of Mathematical Sciences, University of Essex, Colchester, UK
Berthold Lausen
Department of Statistical Sciences "Paolo Fortunati", University of Bologna, Bologna, Italy
Angela Montanari
Department of Statistics & Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
Rebecca Nugent

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this paper

Cite this paper

Gareau, J.C., Beaudry, É., Makarenkov, V. (2023). PcTVI: Parallel MDP Solver Using a Decomposition into Independent Chains. In: Brito, P., Dias, J.G., Lausen, B., Montanari, A., Nugent, R. (eds) Classification and Data Science in the Digital Age. IFCS 2022. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-031-09034-9_12

DOI: https://doi.org/10.1007/978-3-031-09034-9_12
Published: 08 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-09033-2
Online ISBN: 978-3-031-09034-9
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
