[1] YIN Yifeng, YANG Xianzhe, GAN Yong, et al. Research on Prediction of Vulnerability Exploitation Based on LightGBM Algorithm[J]. Journal of Zhengzhou University (Engineering Science), 2022, 43(05): 24-30. [doi:10.13705/j.issn.1671-6833.2022.05.007]

Research on Prediction of Vulnerability Exploitation Based on LightGBM Algorithm

Journal of Zhengzhou University (Engineering Science) [ISSN: 1671-6833 / CN: 41-1339/T]

Volume:
43
Issue:
2022, No. 05
Pages:
24-30
Section:
Publication date:
2022-08-22

Article Info

Title:
Research on Prediction of Vulnerability Exploitation Based on LightGBM Algorithm
Author(s):
YIN Yifeng¹, YANG Xianzhe¹, GAN Yong², MAO Baolei³
1. College of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450001, China;
2. School of Information Engineering, Zhengzhou Institute of Engineering and Technology, Zhengzhou 450001, China;
3. Henan Education Information Security Monitoring Center, Zhengzhou University, Zhengzhou 450001, China
Keywords:
vulnerability exploitation; security warning; LightGBM algorithm; vulnerability intelligence analysis; network security
CLC number:
TP399
DOI:
10.13705/j.issn.1671-6833.2022.05.007
Document code:
A
Abstract:
To help enterprises that struggle to identify priorities among a growing volume of vulnerabilities awaiting remediation, this paper proposed a vulnerability exploitation prediction model based on LightGBM (light gradient boosting machine), a boosting framework built on decision tree algorithms. The model could predict whether exploits exist for vulnerabilities in a large backlog or among newly disclosed ones, so that enterprises could give such vulnerabilities priority. First, research on vulnerability exploitation in China and abroad was reviewed; exploitable vulnerabilities were found to follow the Pareto principle, and exploitability intelligence for public vulnerabilities could be predicted with machine learning algorithms. Second, CVE vulnerability information from the past five years and exploitation data from mainstream vulnerability intelligence platforms such as Seebug and Exploit-DB were collected, relevant features were extracted, and a new dataset was constructed. Third, vulnerability exploitation prediction was formulated as a binary classification problem; taking into account the practical deployment scenario and the ability to handle massive data, algorithm models widely used in network security, including LightGBM and SVM, were selected and trained. Finally, after repeated simulation experiments and parameter optimization, the LightGBM model outperformed the other models in accuracy and recall, reaching 83% and 76%, respectively, indicating good predictive performance and practical value. The results could also provide ideas and reference data for enterprise information security work.
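
The accuracy and recall figures reported in the abstract are standard binary classification metrics. The following is a minimal sketch of how they are computed, with label 1 meaning "an exploit exists" and label 0 meaning "no known exploit". The function name `accuracy_and_recall` and the toy labels are illustrative assumptions, not the paper's code or data.

```python
from typing import Sequence, Tuple


def accuracy_and_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, recall) for binary labels, where 1 is the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the confusion-matrix cells needed for the two metrics.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = len(y_true)
    # Accuracy: share of all vulnerabilities labeled correctly.
    accuracy = (tp + tn) / total if total else 0.0
    # Recall: share of truly exploited vulnerabilities that the model flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Hypothetical toy labels for eight vulnerabilities (not real CVE data).
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]
accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
```

For vulnerability prioritization, recall is arguably the more operationally important of the two, since a missed exploitable vulnerability (a false negative) is one that stays unpatched.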


Last Update: 2022-08-20