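[1] WU Xiangjin (吴相锦), ZHANG Zhonglin (张忠林), ZHOU Shenglong (周生龙). A study of segmentation methods for handwritten Chinese characters in ancient documents (古文献手写汉字切分方法研究) [J]. Journal of Zhengzhou University (Engineering Science), 2015, 36(06): 70. [doi:10.3969/j.issn.1671-6833.2015.06.014]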
ZHANG Zhonglin, WU Xiangjin, ZHOU Shenglong. A Study on the Segmentation Method of Handwritten Characters from Historical Chinese Documents [J]. Journal of Zhengzhou University (Engineering Science), 2015, 36(06): 70. [doi:10.3969/j.issn.1671-6833.2015.06.014]
A Study on the Segmentation Method of Handwritten Characters from Historical Chinese Documents (古文献手写汉字切分方法研究)
Journal of Zhengzhou University (Engineering Science) [ISSN: 1671-6833 / CN: 41-1339/T]
- Volume: 36
- Issue: 2015, No. 06
- Pages: 70
- Section:
- Publication date: 2015-12-25
Article Info
- Title: A Study on the Segmentation Method of Handwritten Characters from Historical Chinese Documents
- Author(s): ZHANG Zhonglin (张忠林)1; WU Xiangjin (吴相锦)1; ZHOU Shenglong (周生龙)2
- Affiliation(s): 1. College of Electronics and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; 2. Gansu Provincial Library, Lanzhou 730000, China
- Keywords: historical Chinese documents (古文献); handwritten Chinese characters (手写汉字); Chinese character segmentation (汉字切分); segmentation algorithm (分割算法)
- CLC number: TP391
- DOI: 10.3969/j.issn.1671-6833.2015.06.014
- Document code: A
- Abstract:
Ancient Chinese documents and their handwritten characters are written in vertical columns and contain many overlapping and touching strokes. Based on these characteristics, this paper proposes column (text line) segmentation and character segmentation methods suited to such documents. Column segmentation uses a statistical projection method with loop filtering: the page is first projected along the vertical direction, and the resulting statistics are then filtered repeatedly until reasonably uniform columns are separated. The algorithm achieves good results under several difficult conditions, including heavy noise, a degree of skew, and uneven column heights. Character segmentation is a multi-step procedure that combines projection, piecewise projection, and top-and-bottom stroke features. The result is then checked against its context, and incorrect cuts are adjusted. Piecewise projection follows a bisection idea: a segment that contains touching or overlapping characters is split into left and right parts, each part is projected separately, and the projection arrays are analyzed to obtain the segmentation path for the segment. The segmentation method of stroke features at top and bottom (SM-SFTB) uses the characteristics of the strokes at the top and bottom of Chinese characters to find over-segmentation and under-segmentation, and the cuts are then adjusted in turn. Experimental results show that the proposed methods work well for segmenting handwritten historical Chinese documents.
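The column segmentation idea in the abstract (project the page vertically, then filter the statistics in a loop until the columns are fairly uniform) can be illustrated with a minimal Python sketch. The abstract does not give the actual filtering rule, so the rule below is an assumption: the projection threshold is raised one step at a time until the widest detected column is at most `max_width_ratio` times the narrowest. The function names and the parameters `max_width_ratio` and `max_rounds` are hypothetical, and the page is assumed to be a binary image stored as a list of rows with 1 for ink.

```python
def column_projection(image):
    """Count ink pixels (value 1) in every pixel column of a binary image."""
    return [sum(col) for col in zip(*image)]


def runs_above(profile, threshold):
    """Return (start, end) pairs, end exclusive, where profile > threshold."""
    runs, start = [], None
    for x, value in enumerate(profile):
        if value > threshold and start is None:
            start = x                      # a run of ink columns begins
        elif value <= threshold and start is not None:
            runs.append((start, x))        # the run ends before x
            start = None
    if start is not None:                  # run reaches the right edge
        runs.append((start, len(profile)))
    return runs


def segment_columns(image, max_width_ratio=1.5, max_rounds=20):
    """Assumed filtering loop: raise the threshold until widths are uniform."""
    profile = column_projection(image)
    threshold = 0
    runs = runs_above(profile, threshold)
    for _ in range(max_rounds):
        widths = [end - start for start, end in runs]
        if not widths or max(widths) <= max_width_ratio * min(widths):
            break                          # columns are uniform enough
        threshold += 1                     # filter out weaker projections
        candidate = runs_above(profile, threshold)
        if not candidate:
            break                          # keep the last non-empty result
        runs = candidate
    return runs


# Toy page: two text columns (pixel columns 1-3 and 6-8) on a 6 x 11 grid.
page = [[0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0] for _ in range(6)]
page[2][4] = page[2][5] = 1   # noise that bridges the two text columns
page[4][10] = 1               # isolated speck near the right margin

print(segment_columns(page))  # [(1, 4), (6, 9)]
```

With a threshold of 0 the noise merges both text columns into one wide run and adds a one-pixel run for the speck. One filtering round removes both effects and leaves two runs of equal width. A real page would also need the skew handling and uneven column heights mentioned in the abstract, which this sketch does not attempt.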
Last Update: