ZHANG Zhonglin1, WU Xiangjin1, ZHOU Shenglong2
Abstract:
In this paper, we propose methods of text line and character segmentation that suit the characteristics of ancient Chinese documents and handwritten Chinese characters, such as vertical writing, overlapping, and conglutination (touching characters). For line segmentation, we propose a method called statistical projection filtering. First, we compute the vertical projection of the document image; we then apply a loop filter to the projection statistics repeatedly until sufficiently uniform columns are isolated. Even in complex cases, such as heavy noise, a certain degree of skew, and non-uniform column height, the algorithm still performs well. For character segmentation, we apply projection, piecewise projection, and a segmentation method based on stroke features at the top and bottom (SM-SFTB). Following the idea of dichotomy, piecewise projection divides characters that overlap or adhere into two parts and projects each part separately; the segmentation path is then obtained by analyzing the projection arrays. SM-SFTB uses the characteristics of Chinese character strokes to detect over-segmentation and under-segmentation, which makes it possible to adjust the segmentation. Finally, a context-combined method is used to check the segmentation, and mistaken segmentations are corrected. The experimental results show that the proposed methods perform well on historical Chinese documents.
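The line segmentation idea described above (a vertical projection followed by repeated filtering until the columns look uniform) can be illustrated with a minimal sketch. The abstract does not specify the filter or the stopping rule, so everything concrete here is an assumption made for illustration: the moving-average filter, the relative threshold, the width-based uniformity test, and all function and parameter names (`vertical_projection`, `smooth`, `find_runs`, `segment_columns`, `rel_threshold`, `tolerance`, `max_iters`) are hypothetical and are not taken from the paper.

```python
def vertical_projection(image):
    """Count ink pixels (value 1) in each pixel column of a binary image."""
    return [sum(col) for col in zip(*image)]


def smooth(profile, radius=1):
    """Moving-average filter (an assumed stand-in for the paper's loop filter).

    The averaging window is clipped at the left and right borders.
    """
    n = len(profile)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(profile[lo:hi]) / (hi - lo))
    return out


def find_runs(profile, threshold):
    """Return half-open (start, end) intervals where profile > threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_columns(image, rel_threshold=0.2, tolerance=0.5, max_iters=10):
    """Filter the projection repeatedly until the detected columns are uniform.

    Assumed stopping rule: every column width lies within
    `tolerance * mean_width` of the mean width.
    """
    profile = [float(v) for v in vertical_projection(image)]
    runs = []
    for _ in range(max_iters):
        threshold = rel_threshold * max(profile, default=0.0)
        runs = find_runs(profile, threshold)
        widths = [end - start for start, end in runs]
        if widths:
            mean_width = sum(widths) / len(widths)
            if all(abs(w - mean_width) <= tolerance * mean_width for w in widths):
                break
        profile = smooth(profile)
    return runs


# Toy page: two 3-pixel-wide text columns plus a thin noise streak between them.
image = [[0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0] for _ in range(6)]
image[0][5] = image[1][5] = 1
print(segment_columns(image))
```

On this toy image the first pass sees three runs (the noise streak counts as a narrow column), fails the uniformity test, and smooths the projection once; the second pass merges the noise into the background and reports two columns. Smoothing also widens the reported column boundaries slightly, which is one reason a real implementation would likely refine the cut positions afterwards.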