A Boundary Smoothing-based Method for Chinese Medical Nested Named Entity Recognition

[1] LIU Na, WU Kedong, LIU Lei, et al. A Boundary Smoothing-based Method for Chinese Medical Nested Named Entity Recognition[J]. Journal of Zhengzhou University (Engineering Science), 2027, 48(XX): 1-8. [doi:10.13705/j.issn.1671-6833.2026.02.014]

Journal of Zhengzhou University (Engineering Science) [ISSN 1671-6833/CN 41-1339/T] Volume: 48 Issue: 2027 XX Page number: 1-8 Column: Publication date: 2027-12-10

Title:: A Boundary Smoothing-based Method for Chinese Medical Nested Named Entity Recognition

Author(s):: LIU Na^1,2, WU Kedong^1,2, LIU Lei^1,2, JI Zhe^1,2, ZHOU Xueyu^1,2; 1. College of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China; 2. The Key Laboratory of Images and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan 750021, China

Keywords:: nested NER; Chinese medical text; boundary prediction; boundary smoothing; pre-trained language model

CLC:: TP391.1

DOI:: 10.13705/j.issn.1671-6833.2026.02.014

Abstract:: Medical corpora commonly exhibit multi-level, multi-granularity semantics with overlapping entities. Existing approaches tend to make overconfident boundary predictions and to model boundary uncertainty inadequately, which hinders the representation of nested relations among entities, so strengthening boundary prediction is essential. This paper proposes a Chinese medical nested named entity recognition model based on boundary smoothing, together with an improved span-encoding strategy. The model uses RoBERTa-wwm-ext-large to obtain token-level representations and a BiLSTM to capture long-range dependencies. In the recognition layer, GlobalPointer locates start and end boundaries in a unified manner, Rotary Position Embedding (RoPE) explicitly encodes relative positional information, and a biaffine decoder strengthens head-tail interactions for span-level discrimination. During training, boundary-smoothing regularization assigns soft labels to annotated spans and their neighboring spans according to distance, which suppresses hard-boundary noise and overconfidence and improves boundary calibration and recall. Experiments on CMeEE, CMeEE-V2, and CLUENER2020 show significant F1 improvements, indicating that the method mitigates boundary uncertainty and nested-entity interference in Chinese medical text and generalizes well.
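The soft-label assignment described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting, so the scheme below is an assumption: it follows one common formulation of boundary smoothing, in which a fraction `epsilon` of the gold span's probability is moved to neighboring spans, and the spans at each Manhattan distance `d` (up to `max_dist`) share `epsilon / max_dist` equally. The function name `boundary_smooth` and its parameters are hypothetical and are not taken from the paper.

```python
def boundary_smooth(seq_len, gold_span, epsilon=0.2, max_dist=1):
    """Return {(start, end): soft label} for one annotated span.

    Assumed scheme (not confirmed by the abstract): spans at Manhattan
    distance d from the gold span share epsilon / max_dist equally, and
    the gold span keeps whatever probability mass is left.
    """
    gs, ge = gold_span
    if not (0 <= gs <= ge < seq_len):
        raise ValueError("gold span must satisfy 0 <= start <= end < seq_len")

    labels = {}
    assigned = 0.0
    for d in range(1, max_dist + 1):
        # Collect every valid span whose start and end offsets sum to d.
        ring = set()
        for ds in range(-d, d + 1):
            rest = d - abs(ds)
            for de in (rest, -rest):
                start, end = gs + ds, ge + de
                if 0 <= start <= end < seq_len:
                    ring.add((start, end))
        if not ring:
            continue  # no valid neighbors at this distance; mass stays on gold
        share = epsilon / max_dist / len(ring)
        for span in ring:
            labels[span] = share
        assigned += epsilon / max_dist

    labels[(gs, ge)] = 1.0 - assigned
    return labels


if __name__ == "__main__":
    # Gold entity covering tokens 3..5 in a 10-token sentence.
    for span, weight in sorted(boundary_smooth(10, (3, 5)).items()):
        print(span, round(weight, 4))
```

In a span-based model these soft labels would replace the one-hot targets in the span-level loss, so spans that miss the gold boundary by one token are penalized less than spans far from any entity. How the soft labels are combined with the GlobalPointer loss in the paper is not stated in the abstract.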

Last Update: 2026-04-03