Published patent application
EP2662802A1 Method and system for preprocessing an image for optical character recognition
Invalid - Withdrawn
![Method and system for preprocessing an image for optical character recognition](/ep/2013/11/13/EP2662802A1/abs.jpg.150x150.jpg)
Basic information:
- Patent title: Method and system for preprocessing an image for optical character recognition
- Patent title (German): Verfahren und System zur Vorverarbeitung eines Bildes zur optischen Zeichenerkennung
- Application number: EP13162939.6 Filing date: 2013-04-09
- Publication number: EP2662802A1 Publication date: 2013-11-13
- Inventors: Al-Omari, Hussein Khalid; Khorsheed, Mohammad Sulaiman
- Applicant: King Abdulaziz City for Science & Technology (KACST)
- Applicant address: P.O. Box 6086 11442 Riyadh SA
- Patent holder: King Abdulaziz City for Science & Technology (KACST)
- Current patent holder: King Abdulaziz City for Science & Technology (KACST)
- Current patent holder address: P.O. Box 6086 11442 Riyadh SA
- Representative: Goddar, Heinz J.
- Priority: US201213467873 20120509
- Main classification: G06K9/00
- IPC classification: G06K9/00
Abstract:
A method and system for preprocessing an image that includes a plurality of columns, or regions, of text are disclosed. A plurality of components associated with the text is determined. Once the components have been determined, a line height and a column spacing are determined for them. Each component is then associated with a column based on the line height and the column spacing. A set of characteristic parameters is calculated for each column, and the components of each column are merged based on those parameters to form sub-words and words. A first plurality of words and/or sub-words is merged and processed as a first region, and a second plurality of words and/or sub-words is merged and processed as a second region, wherein at least a portion of the second region vertically overlaps at least a portion of the first region.
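
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how line height, column spacing, or the characteristic parameters are computed, so everything here is an assumption made for illustration. Components are taken to be axis-aligned bounding boxes (`Box`), the line height is assumed to be the median box height, and the column spacing and word gap are assumed to be fixed multiples of it (`spacing_factor`, `gap_factor`). All function and parameter names are hypothetical.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Box:
    """Bounding box of one connected component (hypothetical representation)."""
    x0: int
    y0: int
    x1: int
    y1: int


def estimate_line_height(boxes):
    # Assumption: the median component height approximates the line height.
    return median(b.y1 - b.y0 for b in boxes)


def split_into_columns(boxes, column_spacing):
    # Sweep left to right; a horizontal gap of at least column_spacing
    # between the running right edge and the next box starts a new column.
    columns, right = [], None
    for b in sorted(boxes, key=lambda b: b.x0):
        if right is None or b.x0 - right >= column_spacing:
            columns.append([b])
            right = b.x1
        else:
            columns[-1].append(b)
            right = max(right, b.x1)
    return columns


def _merge_line(line, word_gap):
    # Merge neighbouring boxes on one text line when their gap is below word_gap.
    merged = []
    for b in sorted(line, key=lambda b: b.x0):
        if merged and b.x0 - merged[-1].x1 < word_gap:
            last = merged[-1]
            merged[-1] = Box(last.x0, min(last.y0, b.y0),
                             max(last.x1, b.x1), max(last.y1, b.y1))
        else:
            merged.append(b)
    return merged


def merge_into_words(column, word_gap):
    # Group boxes into text lines (a box starting at or below the current
    # line's bottom edge opens a new line), then merge within each line.
    words, line, bottom = [], [], None
    for b in sorted(column, key=lambda b: b.y0):
        if bottom is not None and b.y0 >= bottom:
            words.extend(_merge_line(line, word_gap))
            line, bottom = [], None
        line.append(b)
        bottom = b.y1 if bottom is None else max(bottom, b.y1)
    if line:
        words.extend(_merge_line(line, word_gap))
    return words


def preprocess(boxes, spacing_factor=3.0, gap_factor=0.5):
    # Both factors are illustrative assumptions, not values from the patent.
    line_height = estimate_line_height(boxes)
    columns = split_into_columns(boxes, spacing_factor * line_height)
    return [merge_into_words(col, gap_factor * line_height) for col in columns]
```

Calling `preprocess` on a list of `Box` objects returns one list of merged word boxes per column. Because columns are separated purely by horizontal gaps, two resulting regions may overlap vertically, which matches the situation described in the abstract.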
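IPC classification hierarchy:
G | Physics |
--G06 | Computing; calculating; counting |
----G06K | Recognition of data; presentation of data; record carriers; handling record carriers |
------G06K9/00 | Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints |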