We consider discounted Markov Decision Processes (MDPs) with large state spaces, aiming to reduce computational complexity and execution time. Existing hierarchical techniques typically decompose the state space into strongly connected components (SCCs) organized across levels, but they overlook the size of the SCCs at each level, which significantly affects efficiency. We propose the Parallel Hierarchical Value Iteration (PHVI) algorithm, which handles large MDPs efficiently by taking SCC size into account when distributing work across threads, improving computational performance and reducing execution time. Experimental results demonstrate the effectiveness of PHVI and its superiority over traditional methods in solving complex MDPs.
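The abstract does not detail PHVI itself; the sketch below only illustrates the general idea it describes: run value iteration SCC by SCC in reverse topological order, solving independent SCCs of the same level in parallel and scheduling larger SCCs first. It assumes a tabular MDP stored as `transitions[s][a] = [(prob, next_state, reward), ...]`, a discount factor `GAMMA`, and the `networkx` library for the SCC condensation; the level grouping, largest-first ordering, and thread pool are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of SCC-level parallel value iteration (not the paper's PHVI).
# Assumes every reachable state appears as a key of `transitions`.
from concurrent.futures import ThreadPoolExecutor
import networkx as nx

GAMMA = 0.95   # discount factor (assumed)
TOL = 1e-6     # convergence threshold (assumed)

def value_iteration_on_scc(scc, transitions, V):
    """Run value iteration restricted to one SCC.

    States outside the SCC only appear as successors, and their values in V
    are already final because downstream SCCs were solved earlier.
    """
    while True:
        delta = 0.0
        for s in scc:
            best = max(
                sum(p * (r + GAMMA * V[s2]) for p, s2, r in outcomes)
                for outcomes in transitions[s].values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < TOL:
            break

def scc_parallel_value_iteration(transitions, n_threads=4):
    # Build the state-transition graph and its SCC condensation (a DAG).
    G = nx.DiGraph()
    for s, actions in transitions.items():
        G.add_node(s)
        for outcomes in actions.values():
            for _, s2, _ in outcomes:
                G.add_edge(s, s2)
    cond = nx.condensation(G)                       # nodes are SCC indices
    members = nx.get_node_attributes(cond, "members")

    # Level of an SCC = 1 + max level of its successors (sinks get level 0),
    # so solving levels in increasing order respects all dependencies.
    level = {}
    for n in reversed(list(nx.topological_sort(cond))):
        level[n] = 1 + max((level[m] for m in cond.successors(n)), default=-1)

    V = {s: 0.0 for s in transitions}
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for lvl in range(max(level.values()) + 1):
            # SCCs on the same level are independent; submit the largest
            # first so big components do not become stragglers.
            sccs = sorted((members[n] for n in cond if level[n] == lvl),
                          key=len, reverse=True)
            futures = [pool.submit(value_iteration_on_scc, scc, transitions, V)
                       for scc in sccs]
            for f in futures:
                f.result()
    return V
```

Note that in CPython the thread pool mainly illustrates the scheduling; the GIL limits true parallelism for pure-Python loops, whereas the paper targets genuinely parallel execution.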