[1]CHEN Yan,WEI Zijun,LIAO Yuxiang,et al.Joint Extraction Method of Chinese Entities and Relations Based on RoBERTa and Pointer Network[J].Journal of Zhengzhou University (Engineering Science),2026,47(XX):1-10.[doi:10.13705/j.issn.1671-6833.2025.05.007]
Journal of Zhengzhou University (Engineering Science) [ISSN 1671-6833 / CN 41-1339/T]
Volume: 47
Issue: 2026 XX
Pages: 1-10
Column:
Publication date: 2026-09-10
- Title:
-
Joint Extraction Method of Chinese Entities and Relations Based on RoBERTa and Pointer Network
- Author(s):
-
CHEN Yan1,2; WEI Zijun2; LIAO Yuxiang2; TAN Zhixiang2; HU Xiaochun3,4; SONG Ling2
-
1. Guangxi Key Laboratory of Digital Infrastructure, Guangxi Zhuang Autonomous Region Information Center, Nanning 530201, China; 2. School of Computer and Electronic Information, Guangxi University, Nanning 530004, China; 3. Guangxi Key Laboratory of Finance and Economics Big Data, Guangxi University of Finance and Economics, Nanning 530003, China; 4. School of Big Data and Artificial Intelligence, Guangxi University of Finance and Economics, Nanning 530003, China
-
- Keywords:
-
entity and relation joint extraction; RoBERTa; pointer network; natural language processing; deep learning
- CLC:
-
TP391.1; TP312
- DOI:
-
10.13705/j.issn.1671-6833.2025.05.007
- Abstract:
-
To address the triple overlap problem, namely single entity overlap (SEO) and entity pair overlap (EPO), in the joint extraction of entities and relations from unstructured text, this paper proposes a Chinese entity and relation joint extraction method based on RoBERTa and a pointer network. First, for the entity overlap problem, an entity recognition module based on the pointer network is designed; it casts entity recognition as a “token-pair” recognition problem and extracts all candidate entities by identifying their start and end positions. Second, for the triple overlap problem, a relation extraction module based on the multi-head attention mechanism and the pointer network (Ptr-Net) is designed, which casts the extraction of a triple (s, r, o) as the identification of a quintuple (sₕ, sᵣ, r, oₕ, oᵣ). Finally, extensive experiments on the Chinese information extraction dataset DuIE show that the proposed model outperforms all baseline models overall, with precision, recall, and F1 of 81.04%, 85.82%, and 83.36%, respectively.
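The “token-pair” formulation can be illustrated with a minimal sketch. It assumes the model has already produced a score for every (start, end) token pair; the function name `decode_token_pairs`, the threshold, the maximum span length, and the hand-written scores are all hypothetical and are not taken from the paper. The point is only that span-level decoding can return nested or overlapping entities, which sequence tagging with one label per token cannot.

```python
from typing import List, Tuple


def decode_token_pairs(
    scores: List[List[float]],
    threshold: float = 0.5,  # assumed cut-off, not from the paper
    max_len: int = 10,       # assumed maximum entity length in tokens
) -> List[Tuple[int, int]]:
    """Return every (start, end) pair, end inclusive, scoring above threshold."""
    n = len(scores)
    spans = []
    for start in range(n):
        # Only pairs with end >= start are valid spans.
        for end in range(start, min(n, start + max_len)):
            if scores[start][end] > threshold:
                spans.append((start, end))
    return spans


# Character-level tokens of an illustrative sentence.
tokens = list("广西大学位于南宁")

# Hand-written scores standing in for real model output.
scores = [[0.0] * len(tokens) for _ in tokens]
scores[0][1] = 0.90  # "广西"
scores[0][3] = 0.95  # "广西大学", overlaps the span above
scores[6][7] = 0.90  # "南宁"

entities = ["".join(tokens[s:e + 1]) for s, e in decode_token_pairs(scores)]
print(entities)
```

Both “广西” and “广西大学” are recovered even though they share a start position. A quintuple for a relation could then pair two such spans with a relation label, though how the paper scores those pairs is not specified in the abstract.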