eprintid: 27154 rev_number: 8 eprint_status: archive userid: 2 dir: disk0/00/02/71/54 datestamp: 2026-02-04 23:30:14 lastmod: 2026-02-04 23:30:15 status_changed: 2026-02-04 23:30:14 type: article metadata_visibility: show creators_name: ur Rehman, Hafiz Muhammad Raza creators_name: Gul, M. Junaid creators_name: Younas, Rabbiya creators_name: Jhandir, Muhammad Zeeshan creators_name: Álvarez, Roberto Marcelo creators_name: Miró Vera, Yini Airet creators_name: Ashraf, Imran creators_id: creators_id: creators_id: creators_id: creators_id: roberto.alvarez@uneatlantico.es creators_id: yini.miro@uneatlantico.es creators_id: title: End-to-end emergency response protocol for tunnel accidents augmentation with reinforcement learning ispublished: pub subjects: uneat_eng divisions: uneatlantico_produccion_cientifica divisions: uninimx_produccion_cientifica divisions: uninipr_produccion_cientifica divisions: unic_produccion_cientifica divisions: uniromana_produccion_cientifica full_text_status: public keywords: Robotic systems; drones; multi-agents system; path finding; reinforcement learning; tunnel hazards; unmanned aerial vehicles abstract: Autonomous unmanned aerial vehicles (UAVs) offer cost-effective and flexible solutions for a wide range of real-world applications, particularly in hazardous and time-critical environments. Their ability to navigate autonomously, communicate rapidly, and avoid collisions makes UAVs well suited for emergency response scenarios. However, real-time path planning in dynamic and unpredictable environments remains a major challenge, especially in confined tunnel infrastructures where accidents may trigger fires, smoke propagation, debris, and rapid environmental changes. In such conditions, conventional preplanned or model-based navigation approaches often fail due to limited visibility, narrow passages, and the absence of reliable localization signals. 
To address these challenges, this work proposes an end-to-end emergency response framework for tunnel accidents based on Multi-Agent Reinforcement Learning (MARL). Each UAV operates as an independent learning agent using an Independent Q-Learning paradigm, enabling real-time decision-making under limited computational resources. To mitigate premature convergence and local optima during exploration, Grey Wolf Optimization (GWO) is integrated as a policy-guidance mechanism within the reinforcement learning (RL) framework. A customized reward function is designed to prioritize victim discovery, penalize unsafe behavior, and explicitly discourage redundant exploration among agents. The proposed approach is evaluated using a frontier-based exploration simulator under both single-agent and multi-agent settings with multiple goals. Extensive simulation results demonstrate that the proposed framework achieves faster goal discovery, improved map coverage, and reduced rescue time compared to state-of-the-art GWO-based exploration and random search algorithms. These results highlight the effectiveness of lightweight MARL-based coordination for autonomous UAV-assisted tunnel emergency response. 
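The abstract's core method (each UAV as an independent Q-learning agent, GWO-style guidance during exploration, and a shaped reward that rewards victim discovery and penalizes redundant revisits) can be illustrated with a minimal toy sketch. Everything below is a hypothetical simplification for intuition only: the 1-D tunnel grid, the goal location, the learning coefficients, and the leader-biased exploration rule are assumptions, not the paper's actual implementation.

```python
import random

random.seed(0)

GRID = 20           # tunnel cells 0..19 (assumed toy environment)
GOAL = 17           # victim location (assumed)
ACTIONS = (-1, +1)  # move left / right

def shaped_reward(pos, visited):
    """Toy version of the customized reward: reward victim discovery,
    penalize revisiting already-explored cells, small per-step cost."""
    if pos == GOAL:
        return 10.0   # victim discovered
    if pos in visited:
        return -0.5   # discourage redundant exploration
    return -0.1       # step cost

class Agent:
    """One UAV as an independent Q-learner: it keeps its own Q-table
    and never reads other agents' tables (Independent Q-Learning)."""
    def __init__(self):
        self.q = [[0.0, 0.0] for _ in range(GRID)]
        self.pos = 0
        self.visited = set()

    def act(self, eps, leaders):
        if random.random() < eps:
            # GWO-inspired guidance (assumption): instead of purely random
            # exploration, bias the move toward the mean position of the
            # current best agents, loosely mimicking alpha/beta wolves.
            if leaders:
                target = sum(leaders) / len(leaders)
                return 1 if target > self.pos else 0
            return random.randrange(2)
        return max((0, 1), key=lambda a: self.q[self.pos][a])

    def step(self, a, alpha=0.5, gamma=0.95):
        nxt = min(GRID - 1, max(0, self.pos + ACTIONS[a]))
        r = shaped_reward(nxt, self.visited)
        # Standard Q-learning update, applied independently per agent.
        self.q[self.pos][a] += alpha * (
            r + gamma * max(self.q[nxt]) - self.q[self.pos][a])
        self.visited.add(nxt)
        self.pos = nxt
        return r

def train(n_agents=3, episodes=200, steps=60):
    agents = [Agent() for _ in range(n_agents)]
    for ep in range(episodes):
        for ag in agents:
            ag.pos, ag.visited = 0, set()
        eps = max(0.05, 1.0 - ep / episodes)  # decaying exploration
        for _ in range(steps):
            # "Leaders" = the two agents currently closest to the goal.
            leaders = sorted((ag.pos for ag in agents),
                             key=lambda p: abs(p - GOAL))[:2]
            for ag in agents:
                ag.step(ag.act(eps, leaders))
    return agents

agents = train()
```

In this sketch the GWO component only shapes *exploration* (which action a curious agent tries), while the Q-update itself stays unchanged, matching the abstract's description of GWO as a policy-guidance mechanism rather than a replacement for RL.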
date: 2026-01 publication: Scientific Reports id_number: doi:10.1038/s41598-026-37191-w refereed: TRUE issn: 2045-2322 official_url: http://doi.org/10.1038/s41598-026-37191-w access: open language: en citation: ur Rehman, Hafiz Muhammad Raza; Gul, M. Junaid; Younas, Rabbiya; Jhandir, Muhammad Zeeshan; Álvarez, Roberto Marcelo; Miró Vera, Yini Airet and Ashraf, Imran (2026) End-to-end emergency response protocol for tunnel accidents augmentation with reinforcement learning. Scientific Reports. ISSN 2045-2322 document_url: http://repositorio.unic.co.ao/id/eprint/27154/1/s41598-026-37191-w_reference.pdf