
李颖玉

Personal profile

Not yet provided.

Publications

Fast document image comparison in multilingual corpus without OCR

Posted: 2025-04-30

Title: Fast document image comparison in multilingual corpus without OCR

Journal: Multimedia Systems

Abstract: This paper proposes a method for comparing document images in a multilingual corpus. The method consists of character segmentation, feature extraction, and similarity measurement. Character segmentation follows a top-down strategy: projection and a self-adaptive threshold are used to analyze the layout, and text lines are then segmented by horizontal projection. English, Chinese, and Japanese are then recognized by different methods based on the distribution and ratios of the text lines. Finally, characters are segmented using a different strategy for each language. In feature extraction and similarity measurement, four features are used for coarse measurement, and then a template is set up. Based on the templates, a fast template matching method using a coarse-to-fine strategy and bit memory is presented for precise matching. The experimental results demonstrate that the method can handle multilingual document images of different resolutions and font sizes with high precision and speed.
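As a rough illustration of two ideas named in the abstract, the sketch below segments text lines by horizontal projection and compares two binary glyphs using bit operations. It is not the paper's implementation. It assumes a fixed ink threshold (`min_ink`) in place of the self-adaptive threshold, and it reads "bit memory" as packing each glyph into an integer so that a comparison is one XOR plus a bit count. All function names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # binarized image: 1 = ink, 0 = background


def horizontal_projection(image: Image) -> List[int]:
    """Count the ink pixels in each row."""
    return [sum(row) for row in image]


def segment_text_lines(image: Image, min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start_row, end_row_exclusive) for each run of rows
    whose ink count is at least min_ink (a fixed threshold, assumed here)."""
    lines: List[Tuple[int, int]] = []
    start = None
    for y, count in enumerate(horizontal_projection(image)):
        if count >= min_ink and start is None:
            start = y                      # a text line begins
        elif count < min_ink and start is not None:
            lines.append((start, y))       # a text line ends
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(image)))
    return lines


def pack_bits(glyph: Image) -> int:
    """Pack a binary glyph into one integer, row by row."""
    value = 0
    for row in glyph:
        for pixel in row:
            value = (value << 1) | (pixel & 1)
    return value


def bit_distance(a: int, b: int) -> int:
    """Number of differing pixels between two packed glyphs of equal size."""
    return bin(a ^ b).count("1")


page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
]
text_lines = segment_text_lines(page)
glyph_diff = bit_distance(pack_bits([[1, 0], [0, 1]]), pack_bits([[1, 1], [0, 1]]))
```

Here `text_lines` holds the row ranges of the two ink-bearing bands in `page`, and `glyph_diff` counts the pixels at which the two 2x2 glyphs disagree. A coarse-to-fine scheme could run a cheap check first and apply `bit_distance` only to the surviving candidates.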

DOI 10.1007/s00530-015-0484-3

Co-authors: Yuping Lin, Yingyu Li, et al.

Publication date: 2015-10-08
