[1]张震,春美洁,田鸿朋,等.自适应子空间插补的不完整数据证据集成分类[J].郑州大学学报(工学版),2027,48(XX):1-8.[doi:10.13705/j.issn.1671-6833.2026.04.006]
 ZHANG Zhen,CHUN Meijie,TIAN Hongpeng,et al.Ensemble Classification of Incomplete Data Evidence on Adaptive Subspace Imputation[J].Journal of Zhengzhou University (Engineering Science),2027,48(XX):1-8.[doi:10.13705/j.issn.1671-6833.2026.04.006]

自适应子空间插补的不完整数据证据集成分类

Journal of Zhengzhou University (Engineering Science) [ISSN: 1671-6833 / CN: 41-1339/T]

Volume:
48
Issue:
2027, XX
Pages:
1-8
Column:
Publication date:
2027-12-10

Article Info

Title:
Ensemble Classification of Incomplete Data Evidence on Adaptive Subspace Imputation
作者:
张震1,2,春美洁1,田鸿朋2,李友好3,黄伟涛3,张俊杰3
1.郑州大学 河南先进技术研究院, 河南 郑州 450001;2.郑州大学 电气与信息工程学院,河南 郑州 450001;3.河南汇融油气技术有限公司,河南 郑州 450001
Author(s):
ZHANG Zhen1,2, CHUN Meijie1, TIAN Hongpeng2, LI Youhao3, HUANG Weitao3, ZHANG Junjie3
1. Henan Institute of Advanced Technology, Zhengzhou University, Zhengzhou 450001; 2. School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001; 3. Henan Huirong Oil and Gas Technology Co., Ltd., Zhengzhou 450001, China
关键词:
不完整数据;分类;全局重要性;局部重要性;证据理论
Keywords:
incomplete data; classification; global importance; local importance; evidence theory
CLC number:
TP181
DOI:
10.13705/j.issn.1671-6833.2026.04.006
Document code:
A
摘要:
针对基于插补的分类方法在处理缺失数据时的估计值偏差会影响分类性能的问题,提出一种基于自适应子空间插补的不完整数据证据集成分类方法,利用自适应子空间插补和双重证据集成来提升模型对不完整数据集的分类能力。首先,使用谱聚类将特征空间动态划分为多个子空间,在每个子空间内独立进行基于近邻的缺失值插补;其次,设计了一种双重重要性评估机制,计算插补前后训练集数据分布的差异来评估全局重要性,并通过评估分类模型对测试集样本在训练集中近邻的分类能力来评估其分类结果的局部重要性;最后,在证据理论的基础上融合局部重要性和全局重要性,利用不同子空间信息的互补性提升分类性能。在标准数据集上的对比实验表明,所提方法在ARI和AP指标中相比次优方法的提升幅度分别高达6.23个百分点和0.82个百分点,验证了所提方法的有效性和先进性。
Abstract:
To address the problem that estimation bias in imputation-based classification methods degrades classification performance on data with missing values, an evidential ensemble classification method for incomplete data based on adaptive subspace imputation is proposed. The method combines adaptive subspace imputation with dual evidence integration to improve classification on incomplete datasets. First, spectral clustering dynamically partitions the feature space into multiple subspaces, and neighbor-based missing-value imputation is performed independently within each subspace. Second, a dual importance evaluation mechanism is designed: global importance is assessed from the difference between the training-set data distributions before and after imputation, and the local importance of each classification result is assessed from how well the classification model classifies the training-set neighbors of the test sample. Finally, local and global importance are fused within the framework of evidence theory, exploiting the complementarity of information from different subspaces to improve classification performance. Comparative experiments on standard datasets show that the proposed method improves on the second-best method by up to 6.23 percentage points in ARI and up to 0.82 percentage points in AP, validating its effectiveness and its advantage over the compared methods.
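The final fusion step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how global and local importance are turned into weights, so the sketch assumes a single hypothetical reliability weight per subspace (for example, a product of the two importances), applies classical Shafer discounting, combines the discounted masses with Dempster's rule, and decides with the pignistic transformation. All function names and the example numbers are illustrative.

```python
from itertools import product


def discount(probs, alpha):
    """Shafer discounting of one subspace classifier's output.

    probs: class probabilities (assumed to sum to 1).
    alpha: hypothetical reliability weight in [0, 1], assumed here to
           summarize the global and local importance of the subspace.
    A fraction alpha of each singleton mass is kept; the rest is moved
    to the whole frame of discernment (total ignorance).
    """
    frame = frozenset(range(len(probs)))
    mass = {frozenset([k]): alpha * p for k, p in enumerate(probs) if p > 0}
    mass[frame] = mass.get(frame, 0.0) + (1.0 - alpha)
    return mass


def dempster(m1, m2):
    """Dempster's rule of combination for two mass functions."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


def pignistic(mass, n_classes):
    """Pignistic transformation: split each focal mass evenly over its classes."""
    bet = [0.0] * n_classes
    for focal, w in mass.items():
        for k in focal:
            bet[k] += w / len(focal)
    return bet


def fuse(outputs, weights):
    """Discount each subspace output, combine them all, return class probabilities."""
    masses = [discount(p, a) for p, a in zip(outputs, weights)]
    fused = masses[0]
    for m in masses[1:]:
        fused = dempster(fused, m)
    return pignistic(fused, len(outputs[0]))


# Illustrative inputs: three subspace classifiers, three classes.
outputs = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.6, 0.3, 0.1]]
weights = [0.9, 0.3, 0.8]  # hypothetical reliability weights
bet = fuse(outputs, weights)
predicted = max(range(len(bet)), key=bet.__getitem__)
```

A subspace with weight 0 contributes only ignorance and leaves the combined result unchanged, which is why discounting lets an unreliable imputation be down-weighted rather than discarded outright.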

Memo

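Received: 2025-09-30; Revised: 2025-10-30
Funding: Henan Province Key Research and Development Special Project (231111211600); Henan Province Key Project of International Science and Technology Cooperation (231111520300)
Author biography: ZHANG Zhen (b. 1966), male, from Zhengzhou, Henan; professor at Zhengzhou University, Ph.D., doctoral supervisor. Research interests: computer vision, data mining, and intelligent information processing. E-mail: zhangzhen66@126.com.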
更新日期/Last Update: 2026-02-27