[1] CHEN Yan, WEI Zijun, LIAO Yuxiang, et al. Joint Extraction Method of Chinese Entities and Relations Based on RoBERTa and Pointer Network[J]. Journal of Zhengzhou University (Engineering Science), 2026, 47(XX): 1-10. [doi:10.13705/j.issn.1671-6833.2025.05.007]

Joint Extraction Method of Chinese Entities and Relations Based on RoBERTa and Pointer Network

Journal of Zhengzhou University (Engineering Science) [ISSN: 1671-6833 / CN: 41-1339/T]

Volume:
47
Issue:
2026, XX
Pages:
1-10
Column:
Publication date:
2026-09-10

Article Info

Title:
Joint Extraction Method of Chinese Entities and Relations Based on RoBERTa and Pointer Network
Author(s):
CHEN Yan (1, 2), WEI Zijun (2), LIAO Yuxiang (2), TAN Zhixiang (2), HU Xiaochun (3, 4), SONG Ling (2)
1. Guangxi Key Laboratory of Digital Infrastructure, Guangxi Zhuang Autonomous Region Information Center, Nanning 530201, Guangxi, China; 2. School of Computer and Electronic Information, Guangxi University, Nanning 530004, Guangxi, China; 3. Guangxi Key Laboratory of Finance and Economics Big Data, Guangxi University of Finance and Economics, Nanning 530003, Guangxi, China; 4. School of Big Data and Artificial Intelligence, Guangxi University of Finance and Economics, Nanning 530003, Guangxi, China
Keywords:
entity and relation joint extraction; RoBERTa; pointer network; natural language processing; deep learning
CLC number:
TP391.1; TP312
DOI:
10.13705/j.issn.1671-6833.2025.05.007
Document code:
A
Abstract:
To address triple overlap, namely single entity overlap (SEO) and entity pair overlap (EPO), in the joint extraction of entities and relations from unstructured text, this paper proposes a Chinese entity and relation joint extraction method based on RoBERTa and a pointer network. First, to handle entity overlap, an entity recognition module is designed on top of the pointer network; it casts entity recognition as a "token-pair" recognition problem and extracts all candidate entities by identifying their start and end positions. Second, to handle triple overlap, a relation extraction module based on the multi-head attention mechanism and Ptr-Net is designed, which casts extraction of the triple (s, r, o) as identification of the quintuple (sₕ, sᵣ, r, oₕ, oᵣ). Finally, extensive experiments on the Chinese information extraction dataset DuIE show that the proposed model outperforms all baseline models in overall performance, with precision, recall, and F1 of 81.04%, 85.82%, and 83.36%, respectively.
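
The token-pair formulation and the quintuple decoding described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `decode_spans` and `decode_quintuples`, the 0.5 threshold, the toy English sentence, and the hand-written scores are all assumptions made for illustration, and the quintuple is read here as (subject start, subject end, relation, object start, object end). In the actual model the scores would presumably come from the RoBERTa encoder and the pointer-network heads rather than being typed in by hand.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, both inclusive


def decode_spans(scores: List[List[float]], threshold: float = 0.5) -> List[Span]:
    """Return every (start, end) token pair whose score clears the threshold.

    Each cell is scored independently, so nested or overlapping
    entities can be returned side by side.
    """
    n = len(scores)
    spans = []
    for start in range(n):
        for end in range(start, n):  # upper triangle only: end >= start
            if scores[start][end] > threshold:
                spans.append((start, end))
    return spans


def decode_quintuples(
    spans: List[Span],
    relation_scores: Dict[Tuple[Span, str, Span], float],
    threshold: float = 0.5,
) -> List[Tuple[int, int, str, int, int]]:
    """Flatten each accepted (subject span, relation, object span) into five fields."""
    valid = set(spans)
    result = []
    for (subj, rel, obj), score in relation_scores.items():
        # Keep a candidate only if both spans were recognized as entities.
        if score > threshold and subj in valid and obj in valid:
            result.append((subj[0], subj[1], rel, obj[0], obj[1]))
    return sorted(result)


# Hypothetical toy input; a real system would score Chinese tokens with the model.
tokens = ["New", "York", "University", "is", "in", "New", "York"]
n = len(tokens)
scores = [[0.0] * n for _ in range(n)]
scores[0][2] = 0.9   # "New York University"
scores[0][1] = 0.8   # "New York", nested inside the span above
scores[5][6] = 0.95  # second "New York"
scores[3][4] = 0.2   # "is in": below threshold, not an entity

spans = decode_spans(scores)

relation_scores = {
    ((0, 2), "located_in", (5, 6)): 0.9,   # accepted
    ((0, 1), "located_in", (5, 6)): 0.3,   # score too low
    ((0, 2), "located_in", (3, 4)): 0.99,  # object span is not an entity
}
quintuples = decode_quintuples(spans, relation_scores)

for s_start, s_end, rel, o_start, o_end in quintuples:
    subject = " ".join(tokens[s_start:s_end + 1])
    obj = " ".join(tokens[o_start:o_end + 1])
    print(subject, rel, obj)
```

Because every token pair is scored on its own, the nested spans (0, 1) and (0, 2) are both kept, which is the property that lets a token-pair scheme represent overlapping entities and, by extension, triples that share an entity.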


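Memo:
Received: 2025-01-06; Revised: 2025-03-24
Funding: National Natural Science Foundation of China (72461001)
Corresponding author: HU Xiaochun (born 1974), male, from Nanning, Guangxi; professor at Guangxi University of Finance and Economics; research interests: big data and artificial intelligence. E-mail: hxch@gxufe.edu.cn
Last Update: 2026-01-14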