[1] HUANG W W, ZHENG X Y, ZHANG C Q, et al. Research on Intelligent Routing Technology Based on Deep Reinforcement Learning in SDN[J]. Journal of Zhengzhou University (Engineering Science), 2023, 44(01): 44-51.

Research on Intelligent Routing Technology Based on Deep Reinforcement Learning in SDN

Journal of Zhengzhou University (Engineering Science) [ISSN: 1671-6833 / CN: 41-1339/T]

Volume:
44
Issue:
2023, No. 01
Pages:
44-51
Section:
Publication date:
2022-12-06

Article Info

Title:
Research on Intelligent Routing Technology Based on Deep Reinforcement Learning in SDN
Author(s):
HUANG Wanwei, ZHENG Xiangyu, ZHANG Chaoqin, WANG Sunan, ZHANG Xiaohui
Document code:
A
Abstract:
To address the slow convergence, high average delay, and low bandwidth utilization of existing intelligent routing algorithms, this study proposes RDPG-Route, a multi-path intelligent routing algorithm based on deep reinforcement learning (DRL). The algorithm uses the recurrent deterministic policy gradient (RDPG) as its training framework and a long short-term memory (LSTM) network as its neural network. Drawing on the strength of RDPG in handling high-dimensional problems and on the storage capacity of the memory cell in the LSTM recurrent core, it feeds the dynamically changing network state into the neural network for training. Once training has converged, the action values output by the neural network are used as network link weights, and traffic is split according to a multi-path routing strategy so that network routing is adjusted intelligently and dynamically. Finally, RDPG-Route is compared with the ECMP, DRL-TE, and DRL-R-DDPG routing algorithms. The results indicate that RDPG-Route converges well and is effective: compared with the other intelligent routing algorithms, it reduces the average end-to-end delay by at least 7.2%, improves throughput by 6.5%, and reduces the packet loss rate by 8.9% and the maximum link utilization by 6.3%.
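
The step in which learned link weights drive multi-path traffic splitting can be illustrated with a minimal Python sketch. The abstract does not state the splitting rule, so the inverse-cost proportional split below is an assumption, not the authors' method. The topology, the `weights` values (standing in for the action values a trained network would output), and the helper names `path_cost` and `split_traffic` are all hypothetical.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]


def path_cost(path: List[str], weights: Dict[Link, float]) -> float:
    """Sum the link weights along a path given as a sequence of nodes."""
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))


def split_traffic(
    paths: List[List[str]], weights: Dict[Link, float], demand: float
) -> List[float]:
    """Split a demand across candidate paths.

    Assumed rule (not taken from the paper): each path receives a share
    inversely proportional to its total cost, so cheaper paths carry more.
    """
    costs = [path_cost(p, weights) for p in paths]
    if any(c <= 0 for c in costs):
        raise ValueError("path costs must be positive")
    inverse = [1.0 / c for c in costs]
    total = sum(inverse)
    return [demand * x / total for x in inverse]


# Hypothetical link weights, standing in for the network's action values.
weights = {
    ("A", "B"): 1.0,
    ("B", "D"): 1.0,
    ("A", "C"): 2.0,
    ("C", "D"): 2.0,
}
# Two candidate paths from A to D.
paths = [["A", "B", "D"], ["A", "C", "D"]]

shares = split_traffic(paths, weights, demand=90.0)
for path, share in zip(paths, shares):
    print("->".join(path), round(share, 2))
```

In this toy case the path through B costs half as much as the path through C, so it carries twice the traffic. In a DRL setting the weights would be refreshed as the network state changes, which is what makes the split dynamic.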
Last Update: 2022-12-07