[1] GE Lina, CHEN Yuanyuan, WANG Jie, et al. Differential Privacy Protection Scheme of Adaptive Clustering by Fast Search and Find of Density Peaks[J]. Journal of Zhengzhou University (Engineering Science), 2023, 44(06): 19-24. [doi:10.13705/j.issn.1671-6833.2023.03.010]

Differential Privacy Protection Scheme of Adaptive Clustering by Fast Search and Find of Density Peaks

Journal of Zhengzhou University (Engineering Science) [ISSN: 1671-6833 / CN: 41-1339/T]

Volume:
44
Issue:
2023, No. 06
Pages:
19-24
Section:
Publication date:
2023-09-25

Article Info

Title:
Differential Privacy Protection Scheme of Adaptive Clustering by Fast Search and Find of Density Peaks
Author(s):
GE Lina (葛丽娜), CHEN Yuanyuan (陈圆圆), WANG Jie (王捷), WANG Zhe (王哲)
1. School of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China; 2. Key Laboratory of Network Communication Engineering, Guangxi Minzu University, Nanning 530006, China; 3. Guangxi Key Laboratory of Hybrid Computation and IC Design Analysis, Guangxi Minzu University, Nanning 530006, China
Keywords:
density peaks; differential privacy; random noise; clustering algorithm
DOI:
10.13705/j.issn.1671-6833.2023.03.010
Document code:
A
Abstract:
To address the privacy leakage that arises when the adaptive clustering by fast search and find of density peaks (AdDPC) algorithm computes local density, as well as the limitations of the algorithm's one-time allocation strategy, a differential privacy protection scheme for the improved density peak clustering algorithm is proposed. The scheme adds Laplace random noise while the algorithm computes local density, so that even an attacker with maximum background knowledge cannot mount a differential attack, that is, cannot obtain information about a target data point by adding or deleting a point in the dataset. This protects private data. In addition, a reachability definition is introduced when assigning non-center points, which improves the allocation strategy of AdDPC and avoids the misassignment of data points caused by the one-time allocation strategy. Experiments compared the F-Measure and ARI of the DP-rcCFSFDP, AdAPC-rDP, and IDP K-means algorithms. The results show that when the privacy budget is greater than 1.5, the F-Measure and ARI of the proposed algorithm are better than those of the other algorithms, and that the proposed algorithm protects sensitive data while preserving data utility.
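The idea of perturbing local density with Laplace noise can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the AdDPC density kernel, the sensitivity analysis, or how the privacy budget is split, so the code below assumes a plain cutoff kernel (count of neighbors within a distance `d_c`), a sensitivity of 1 per density value, and hypothetical function names `local_density` and `noisy_local_density`.

```python
import numpy as np


def local_density(points, d_c):
    """Cutoff-kernel local density: number of other points within distance d_c.

    Assumption: a simple cutoff kernel stands in for the AdDPC density,
    which the abstract does not specify.
    """
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Subtract 1 so that a point does not count itself as a neighbor.
    return (dists < d_c).sum(axis=1) - 1


def noisy_local_density(points, d_c, epsilon, sensitivity=1.0, rng=None):
    """Add Laplace noise with scale sensitivity / epsilon to each density value.

    Assumption: adding or deleting one point changes a single neighbor
    count by at most 1, hence the default sensitivity of 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon (privacy budget) must be positive")
    rng = np.random.default_rng() if rng is None else rng
    rho = local_density(points, d_c).astype(float)
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=rho.shape)
    return rho + noise


if __name__ == "__main__":
    pts = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
    print(local_density(pts, d_c=1.5))
    print(noisy_local_density(pts, d_c=1.5, epsilon=1.5,
                              rng=np.random.default_rng(0)))
```

In this sketch a smaller `epsilon` gives a larger noise scale and therefore stronger privacy but less accurate densities, which matches the trade-off between privacy budget and clustering quality that the abstract reports.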
Last Update: 2023-10-22