Open Access
ITM Web Conf., Volume 73, 2025
International Workshop on Advanced Applications of Deep Learning in Image Processing (IWADI 2024)
Article Number: 01003
Number of pages: 10
Section: Reinforcement Learning and Optimization Techniques
DOI: https://doi.org/10.1051/itmconf/20257301003
Published online: 17 February 2025
  1. S. M. LaValle, Planning Algorithms (Cambridge University Press, London, 2006)
  2. P. E. Hart, N. J. Nilsson, B. Raphael, A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics, 4, 100 (1968)
  3. D. Silver et al., Mastering the game of Go with deep neural networks and tree search. Nature, 529, 484 (2016)
  4. T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, D. Wierstra, Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015)
  5. V. Mnih et al., Human-level control through deep reinforcement learning. Nature, 518, 529 (2015)
  6. H. Kretzschmar et al., Socially compliant mobile robot navigation via inverse reinforcement learning. The International Journal of Robotics Research, 35, 1289 (2016)
  7. J. Schulman et al., Trust region policy optimization. Proceedings of the 32nd International Conference on Machine Learning, 1889 (2015)
  8. L. Pinto, A. Gupta, Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), (2016) pp. 3406-3413
  9. C. Finn et al., Model-agnostic meta-learning for fast adaptation of deep networks. arXiv preprint arXiv:1703.03400 (2017)
  10. E. W. Dijkstra, A note on two problems in connexion with graphs. Numerische Mathematik, 1, 269 (1959)
  11. V. Lumelsky, A. Stepanov, Dynamic path planning for a mobile automaton with limited information on the environment. IEEE Transactions on Automatic Control, 31, 1058 (1986)
  12. T. Lozano-Perez, Spatial planning: A configuration space approach. IEEE Transactions on Computers, C-32, 108 (1983)
  13. O. Khatib, Real-time obstacle avoidance for manipulators and mobile robots. The International Journal of Robotics Research, 5, 90 (1986)
  14. S. S. Ge, Y. J. Cui, Dynamic motion planning for mobile robots using potential field method. Autonomous Robots, 13, 207 (2002)
  15. S. M. LaValle, J. J. Kuffner, Randomized kinodynamic planning. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), (2001)
  16. L. E. Kavraki, P. Svestka, J. C. Latombe, M. H. Overmars, Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Transactions on Robotics and Automation, 12, 566 (1996)
  17. Y. Zhu et al., Target-driven visual navigation in indoor scenes using deep reinforcement learning. IEEE International Conference on Robotics and Automation (ICRA), (2017) pp. 3357-3364
  18. P. Mirowski, R. Pascanu, F. Viola, H. Soyer, A. Ballard, A. Banino, M. Denil, R. Goroshin, L. Sifre, K. Kavukcuoglu, Learning to navigate in complex environments. arXiv preprint arXiv:1611.03673 (2016)
  19. J. Schulman et al., Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)
  20. B. Wang, End-to-end autonomous driving with deep reinforcement learning in simulation environments. Technical University of Dresden (2024)
  21. C. Berner, G. Brockman, B. Chan, V. Cheung, P. Debiak et al., Dota 2 with large scale deep reinforcement learning. arXiv preprint arXiv:1912.06680 (2019)
  22. S. Gu et al., Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates. 2017 IEEE International Conference on Robotics and Automation (ICRA), (2017) pp. 3389-3396
  23. D. Zhang, Z. Xuan, J. Yao, X. Li, Path planning of unmanned aerial vehicle in complex environments based on state-detection twin delayed deep deterministic policy gradient. Machines, 11, 108 (2023)
  24. D. Brunori, DAMIAN: A delay-aware multi-aerial navigation environment for cooperative DRL-based UAV systems. PhD thesis, Università di Roma La Sapienza (2024)
  25. Z. Wang, W. Gao, G. Li, Z. Wang, Path planning for unmanned aerial vehicle via off-policy reinforcement learning with enhanced exploration. IEEE Transactions on Intelligent Transportation Systems (2024), pp. 2625-2639
  26. Z. Chu, Y. Wang, D. Zhu, Local 2-D path planning of unmanned underwater vehicles in continuous action space based on the twin-delayed deep deterministic policy gradient. IEEE Transactions on Systems, Man, and Cybernetics: Systems (2024), pp. 2775-2785
  27. C. Chronis, G. Anagnostopoulos, E. Politi, Dynamic navigation in unconstrained environments using reinforcement learning algorithms. IEEE Transactions on Vehicular Technology, 72, 102 (2023)
  28. P. Li, D. Chen, Y. Wang, L. Zhang, S. Zhao, Path planning of mobile robot based on improved TD3 algorithm in dynamic environment. Heliyon, 10, e2405 (2024)
  29. M. Quinones-Ramirez, J. Rios-Martinez, Robot path planning using deep reinforcement learning. arXiv preprint arXiv:2302.09120 (2023)
  30. Y. Zhang, W. Zhao, J. Wang, Y. Yuan, Recent progress, challenges and future prospects of applied deep reinforcement learning: A practical perspective in path planning. Neurocomputing, 608, 128423 (2024)
  31. Y. Xiaofei, S. Yilun, L. Wei, Y. Hui, Z. Weibo, Global path planning algorithm based on double DQN for multi-tasks amphibious unmanned surface vehicle. Ocean Engineering, 244, 112345 (2022)

Current usage metrics show the cumulative count of Article Views (full-text article views, including HTML views and PDF and ePub downloads, according to the available data) and Abstract Views on the Vision4Press platform.

Data correspond to usage on the platform after 2015. Usage metrics become available 48-96 hours after online publication and are updated daily on weekdays.

The initial download of the metrics may take a while.