[1] Yin Yifeng, Yang Xianzhe, Gan Yong, et al. Research on Prediction of Vulnerability Exploitation Based on LightGBM Algorithm[J]. Journal of Zhengzhou University (Engineering Science), 2022, 43(05): 24-30.

Research on Prediction of Vulnerability Exploitation Based on LightGBM Algorithm

Journal of Zhengzhou University (Engineering Science) [ISSN:1671-6833/CN:41-1339/T]

Volume:
43
Issue:
2022, No. 05
Pages:
24-30
Section:
Publication date:
2022-08-22

Article Info

Title:
Research on Prediction of Vulnerability Exploitation Based on LightGBM Algorithm
Authors:
Yin Yifeng1 Yang Xianzhe1 Gan Yong2 Mao Baolei3
Document code:
A
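Abstract (translated from Chinese):
To address the problem that enterprises facing an ever-growing volume of vulnerabilities to remediate cannot identify priorities or know where to start, this paper proposes a vulnerability exploitation prediction model based on LightGBM (light gradient boosting machine), a boosting framework built on decision tree algorithms. The model predicts, among massive numbers of security vulnerabilities or newly disclosed ones, whether an exploit exists for a given vulnerability, so that enterprises can prioritize such vulnerabilities. First, a review of domestic and international research on vulnerability exploitation shows that exploitable vulnerabilities follow the Pareto principle and that exploitability intelligence for public vulnerabilities can be predicted with machine learning algorithms. Second, CVE vulnerability information from the past 5 years, together with exploit data obtained from mainstream vulnerability intelligence platforms such as Sebbug and Exploit-DB, was collected, relevant features were extracted, and a new dataset was constructed. Third, the exploitation prediction task was framed as a binary classification problem; taking into account the practical scenarios in which the model would operate and its ability to process massive data, algorithm models widely used in network security, including LightGBM and SVM, were selected and trained. Finally, after repeated simulation experiments and parameter optimization, the model was found to outperform the other models in accuracy, recall and other metrics, reaching 83% and 76% respectively, which indicates good predictive performance and practical value. The results can also provide ideas and reference data for enterprise information security work.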
Abstract:
In order to solve the problem that enterprises could not identify the key points of the increasing volume of vulnerabilities and repair them effectively, this paper proposed a vulnerability exploitation prediction model based on LightGBM (light gradient boosting machine), a boosting framework built on the decision tree algorithm. This model could predict whether exploits existed for vulnerabilities among a large number of security vulnerabilities or newly disclosed vulnerabilities, so that companies could give priority to such vulnerabilities. First, studies related to the exploitation of vulnerabilities were reviewed. Exploitable vulnerabilities were found to comply with Pareto's law, and exploitability intelligence for public vulnerabilities could be predicted through machine learning algorithms. Then, CVE vulnerability information from the past 5 years and vulnerability exploitation data obtained from mainstream vulnerability intelligence platforms such as Sebbug and Exploit-DB were collected to extract relevant features and construct a new dataset. Next, the vulnerability exploitation prediction work was framed as a binary classification problem, with full consideration given to the actual working scenarios of the algorithm model and its ability to process massive data. Algorithm models widely used in the field of network security were selected, including LightGBM, SVM, etc., and modeling and learning were carried out. Finally, after many simulation experiments and parameter optimization, it was found that this model was superior to the other models in terms of accuracy and recall, reaching 83% and 76%, respectively, indicating that the model had good prediction performance and application value. At the same time, the results of this paper could also provide certain construction ideas and data references for enterprise information security work.
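The abstract reports accuracy and recall for a binary classifier that labels each vulnerability as exploited or not. As a minimal sketch of how those two metrics are computed, the snippet below uses only the Python standard library. The function name `accuracy_and_recall` and the label lists are hypothetical illustrations, not the paper's code or data, and they do not reproduce the reported 83% and 76%.

```python
from typing import Sequence, Tuple


def accuracy_and_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, recall) for binary labels, where 1 means 'an exploit exists'."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Count the confusion-matrix cells needed for the two metrics.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Accuracy: share of all vulnerabilities classified correctly.
    accuracy = (tp + tn) / len(y_true)
    # Recall: share of truly exploited vulnerabilities that the model flagged.
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return accuracy, recall


# Hypothetical ground truth and predictions for eight vulnerabilities.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]

acc, rec = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={acc:.2f} recall={rec:.2f}")
```

Because exploited vulnerabilities are a small minority of all disclosed ones, recall is the more informative of the two for prioritization: a model that never predicts "exploited" can still score high accuracy while catching nothing.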
Last Update: 2022-08-20