Quick Patent Search - Fast Search of Global Patents, Free Commercial-Use Patent Database - IPRDB

1. Granted Invention Patent

US08594422B2 Page layout determination of an image undergoing optical character recognition (Status: Active)
Publication No.: US08594422B2
Publication date: 2013-11-26
Application No.: US12721949
Filing date: 2010-03-11
Applicant: Mircea Cimpoi, Sasa Galic, Milan Vugdelija
Inventor: Mircea Cimpoi, Sasa Galic, Milan Vugdelija
IPC classification: G06K9/00
CPC classification: G06K9/18, G06K9/00463, G06K9/3216, G06K2209/01
Abstract: A method and system are provided for identifying a page layout of an image that includes textual regions. The textual regions are to undergo optical character recognition (OCR). The system includes an input component that receives an input image that includes words around which bounding boxes have been formed and a text identifying component that groups the words into a plurality of text regions. A reading line component groups words within each of the text regions into reading lines. A text region sorting component sorts the text regions in accordance with their reading order.
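As a rough illustration of the reading line step described in the abstract, the following minimal Python sketch groups word bounding boxes into lines. The `Box` type, the function name, and the grouping rule (a word joins a line when its vertical center falls inside that line's current vertical extent) are assumptions made for this example; the abstract does not state how the patent performs the grouping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical word bounding box in pixel coordinates (y grows downward)."""
    left: int
    top: int
    right: int
    bottom: int


def group_into_reading_lines(words):
    """Group word boxes into reading lines, top to bottom, left to right.

    Assumed rule (not from the patent): a word joins the first line whose
    vertical extent contains the word's vertical center.
    """
    lines = []
    for word in sorted(words, key=lambda b: (b.top, b.left)):
        center = (word.top + word.bottom) / 2
        for line in lines:
            line_top = min(b.top for b in line)
            line_bottom = max(b.bottom for b in line)
            if line_top <= center <= line_bottom:
                line.append(word)
                break
        else:
            # No existing line matched, so this word starts a new line.
            lines.append([word])
    for line in lines:
        line.sort(key=lambda b: b.left)
    lines.sort(key=lambda line: min(b.top for b in line))
    return lines
```

Sorting the resulting lines (or whole text regions) by position is only one possible notion of reading order; multi-column pages would need a more careful rule.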

2. Invention Patent Application

US20110222771A1 PAGE LAYOUT DETERMINATION OF AN IMAGE UNDERGOING OPTICAL CHARACTER RECOGNITION (Status: Active)
Publication No.: US20110222771A1
Publication date: 2011-09-15
Application No.: US12721949
Filing date: 2010-03-11
Applicant: Mircea Cimpoi, Sasa Galic, Milan Vugdelija
Inventor: Mircea Cimpoi, Sasa Galic, Milan Vugdelija
IPC classification: G06K9/34
CPC classification: G06K9/18, G06K9/00463, G06K9/3216, G06K2209/01
Abstract: A method and system are provided for identifying a page layout of an image that includes textual regions. The textual regions are to undergo optical character recognition (OCR). The system includes an input component that receives an input image that includes words around which bounding boxes have been formed and a text identifying component that groups the words into a plurality of text regions. A reading line component groups words within each of the text regions into reading lines. A text region sorting component sorts the text regions in accordance with their reading order.

3. Invention Patent Application

US20110222769A1 DOCUMENT PAGE SEGMENTATION IN OPTICAL CHARACTER RECOGNITION (Status: Active)
Publication No.: US20110222769A1
Publication date: 2011-09-15
Application No.: US12720943
Filing date: 2010-03-10
Applicant: Sasa Galic, Bogdan Radakovic, Nikola Todic
Inventor: Sasa Galic, Bogdan Radakovic, Nikola Todic
IPC classification: G06K9/34, G06K9/72
CPC classification: G06K9/00456, G06K9/3283, G06K9/38, G06K9/4604, G06K2209/01
Abstract: Page segmentation in an optical character recognition process is performed to detect textual objects and/or image objects. Textual objects in an input grayscale image are detected by selecting candidates for native lines, which are sets of horizontally neighboring connected components (i.e., subsets of image pixels where each pixel from the set is connected with all remaining pixels from the set) having similar vertical statistics defined by values of baseline (the line upon which most text characters “sit”) and mean line (the line under which most of the characters “hang”). Binary classification is performed on the native line candidates to classify them as textual or non-textual through examination of any embedded regularity. Image objects are indirectly detected by detecting the image's background using the detected text to define the background. Once the background is detected, what remains (i.e., the non-background) is an image object.
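The connected components that the abstract uses as building blocks for native lines can be illustrated with a standard flood-fill labeling. The sketch below is a generic textbook technique, not the patent's procedure; the function name, the binary-image input (a list of rows where truthy means foreground), and the choice of 8-connectivity are assumptions made for this example.

```python
from collections import deque


def connected_components(image):
    """Return the 8-connected foreground components of a binary image.

    `image` is a list of equal-length rows; truthy values are foreground.
    Each component is returned as a list of (row, col) pixel coordinates.
    """
    height = len(image)
    width = len(image[0]) if image else 0
    seen = set()
    components = []
    for r in range(height):
        for c in range(width):
            if not image[r][c] or (r, c) in seen:
                continue
            # Breadth-first flood fill starting from an unvisited pixel.
            queue = deque([(r, c)])
            seen.add((r, c))
            component = []
            while queue:
                cr, cc = queue.popleft()
                component.append((cr, cc))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        inside = 0 <= nr < height and 0 <= nc < width
                        if inside and image[nr][nc] and (nr, nc) not in seen:
                            seen.add((nr, nc))
                            queue.append((nr, nc))
            components.append(component)
    return components
```

Grouping horizontally neighboring components with similar baseline and mean line values into native line candidates would be a further step on top of this labeling.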

4. Granted Invention Patent

US08526732B2 Text enhancement of a textual image undergoing optical character recognition (Status: Active)
Publication No.: US08526732B2
Publication date: 2013-09-03
Application No.: US12720732
Filing date: 2010-03-10
Applicant: Sasa Galic, Djordje Nijemcevic, Bodin Dresevic
Inventor: Sasa Galic, Djordje Nijemcevic, Bodin Dresevic
IPC classification: G06K9/00
CPC classification: G06K9/4638, G06K9/38, G06K2209/01, G06K2209/015
Abstract: A method for enhancing a textual image for undergoing optical character recognition begins by receiving an image that includes native lines of text. A background line profile is determined which represents an average background intensity along the native lines in the image. Likewise, a foreground line profile is determined which represents an average foreground intensity along the native lines in the image. The pixels in the image are assigned to either a background or foreground portion of the image based at least in part on the background line profile and the foreground line profile. The intensity of the pixels designated to the background portion of the image is adjusted to a maximum brightness so as to represent a portion of the image that does not include text.
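The idea of assigning pixels to background or foreground from two intensity profiles can be sketched in a few lines of Python. This is a deliberately crude stand-in, not the patented method: it assumes dark text on a light background, treats a single native line as a list of grayscale rows, and estimates both profiles per pixel column by splitting that column at its mean. The function name and the midpoint threshold are hypothetical choices for this example.

```python
def enhance_line(rows, max_brightness=255):
    """Whiten the background of one grayscale text line.

    `rows` is a non-empty list of equal-length rows of intensities (0-255).
    For each column, the background profile is the mean of the pixels at or
    above the column mean and the foreground profile is the mean of the
    pixels below it (assumption: text is darker than the page).
    """
    height = len(rows)
    width = len(rows[0])
    out = [list(row) for row in rows]
    for x in range(width):
        column = [rows[y][x] for y in range(height)]
        mean = sum(column) / height
        bright = [v for v in column if v >= mean]
        dark = [v for v in column if v < mean]
        background = sum(bright) / len(bright)
        # A uniform column has no darker pixels; treat it as all background.
        foreground = sum(dark) / len(dark) if dark else background
        threshold = (background + foreground) / 2
        for y in range(height):
            if column[y] >= threshold:
                # Background pixels are pushed to maximum brightness.
                out[y][x] = max_brightness
    return out
```

A practical implementation would likely smooth the profiles along the line rather than estimating them independently for every column.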

5. Granted Invention Patent

US08380009B2 Resolution adjustment of an image that includes text undergoing an OCR process (Status: Active)
Publication No.: US08380009B2
Publication date: 2013-02-19
Application No.: US12721705
Filing date: 2010-03-11
Applicant: Sasa Galic
Inventor: Sasa Galic
IPC classification: G06K9/32, G06K15/02
CPC classification: G06K9/42, G06K2209/01
Abstract: A system and method are provided that rescale a received image to an optimal size to undergo an optical character recognition (OCR) process. The system includes an optimal size determination component that determines an optimum size for the image such that processing time of the received image is minimized without affecting accuracy. The optimal size determination component determines the optimum size of the image based at least in part on a dominant interline spacing of text and a dominant text height. The system also includes a rescaling component that resizes the received image to the determined optimum size.

6. Granted Invention Patent

US08189961B2 Techniques in optical character recognition (Status: Active)
Publication No.: US08189961B2
Publication date: 2012-05-29
Application No.: US12797219
Filing date: 2010-06-09
Applicant: Djordje Nijemcevic, Sasa Galic
Inventor: Djordje Nijemcevic, Sasa Galic
IPC classification: G06K9/00
CPC classification: G06K9/3283, G06K2209/01
Abstract: An image deskew system and techniques are used in the context of optical character recognition. An image is obtained of an original set of characters in an original linear (horizontal) orientation. An acquired set of characters, which is skewed relative to the original linear orientation by a rotation angle, is represented by pixels of the image. The rotation angle is estimated, and a confidence value may be associated with the estimation, to determine whether to deskew the image. In connection with rotation angle estimation, an edge detection filter is applied to the acquired set of characters to produce an edge map, which is input to a linear Hough transform filter to produce a set of output lines in parametric form. The output lines are assigned scores, and based on the scores, at least one output line is determined to be a dominant line with a slope approximating the rotation angle.

7. Invention Patent Application

US20110243445A1 DETECTING POSITION OF WORD BREAKS IN A TEXTUAL LINE IMAGE (Status: Active)
Publication No.: US20110243445A1
Publication date: 2011-10-06
Application No.: US12749599
Filing date: 2010-03-30
Applicant: Aleksandar Uzelac, Bodin Dresevic, Sasa Galic, Bogdan Radakovic
Inventor: Aleksandar Uzelac, Bodin Dresevic, Sasa Galic, Bogdan Radakovic
IPC classification: G06K9/34, G06K9/18
CPC classification: G06K9/344, G06K9/342, G06K2209/01
Abstract: Line segmentation in an OCR process is performed to detect the positions of words within an input textual line image by extracting features from the input to locate breaks and then classifying the breaks into one of two break classes which include inter-word breaks and inter-character breaks. An output including the bounding boxes of the detected words and a probability that a given break belongs to the identified class can then be provided to downstream OCR or other components for post-processing. Advantageously, by reducing line segmentation to the extraction of features, including the position of each break and the number of break features, and break classification, the task of line segmentation is made less complex but with no loss of generality.
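The two stages named in the abstract, locating breaks and then classifying each one as inter-word or inter-character, can be sketched as follows. The abstract describes feature extraction followed by a classifier that also yields a probability; this example substitutes a plain width threshold for that classifier, which is an assumption made for illustration. The function names, the per-column ink flags used as input, and the `ratio` parameter are hypothetical.

```python
from statistics import median


def find_breaks(ink_columns):
    """Return (start, width) for each blank run between inked columns.

    `ink_columns` holds one truthy or falsy flag per pixel column of the
    line image. Leading and trailing blank runs are margins, not breaks.
    """
    breaks = []
    start = None
    seen_ink = False
    for x, has_ink in enumerate(ink_columns):
        if has_ink:
            if start is not None and seen_ink:
                breaks.append((start, x - start))
            start = None
            seen_ink = True
        elif start is None:
            start = x
    return breaks


def classify_breaks(breaks, ratio=1.5):
    """Label each break by comparing its width with the median break width.

    Simple stand-in for a trained classifier: breaks wider than
    `ratio` times the median width are labeled inter-word.
    """
    if not breaks:
        return []
    typical = median(width for _, width in breaks)
    return [
        (start, width,
         "inter-word" if width > ratio * typical else "inter-character")
        for start, width in breaks
    ]
```

Word bounding boxes would then follow from the spans between consecutive inter-word breaks.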

8. Invention Patent Application

US20110222773A1 PARAGRAPH RECOGNITION IN AN OPTICAL CHARACTER RECOGNITION (OCR) PROCESS (Status: Active)
Publication No.: US20110222773A1
Publication date: 2011-09-15
Application No.: US12720992
Filing date: 2010-03-10
Applicant: Bogdan Radakovic, Sasa Galic, Aleksandar Uzelac
Inventor: Bogdan Radakovic, Sasa Galic, Aleksandar Uzelac
IPC classification: G06K9/18, G06K9/62
CPC classification: G06K9/00469, G06K9/00463
Abstract: An image processing apparatus for detecting paragraphs in a textual image includes an input component for receiving an input image in which textual lines and words have been identified and a page classification component for classifying the input image as a first or second page type. The apparatus also includes a paragraph detection component for classifying all textual lines on the input image as a beginning paragraph line or a continuation paragraph line. The apparatus is also provided with a paragraph creation component for creating paragraphs that include textual lines between two successive beginning paragraph lines, including a first of the two successive beginning paragraph lines. The paragraphs that have been identified may be classified by the type of alignment they exhibit. For instance, paragraphs may be classified according to whether they are left aligned, right aligned, center aligned or justified.
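The paragraph creation rule in the abstract (a paragraph runs from one beginning line up to, but not including, the next beginning line) is simple enough to sketch directly. The function name and the `(text, is_beginning)` input format are assumptions made for this example; how each line gets its beginning or continuation label is the job of the detection component and is not modeled here.

```python
def create_paragraphs(lines):
    """Group labeled lines into paragraphs.

    `lines` is a sequence of (text, is_beginning) pairs in reading order.
    Each beginning line opens a new paragraph; continuation lines are
    appended to the paragraph that is currently open. A continuation line
    with no open paragraph starts one, so no text is dropped.
    """
    paragraphs = []
    for text, is_beginning in lines:
        if is_beginning or not paragraphs:
            paragraphs.append([text])
        else:
            paragraphs[-1].append(text)
    return paragraphs
```

For example, three lines labeled beginning, continuation, beginning produce two paragraphs, the first holding two lines and the second holding one.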

9. Invention Patent Application

US20110222768A1 TEXT ENHANCEMENT OF A TEXTUAL IMAGE UNDERGOING OPTICAL CHARACTER RECOGNITION (Status: Active)
Publication No.: US20110222768A1
Publication date: 2011-09-15
Application No.: US12720732
Filing date: 2010-03-10
Applicant: Sasa Galic, Djordje Nijemcevic, Bodin Dresevic
Inventor: Sasa Galic, Djordje Nijemcevic, Bodin Dresevic
IPC classification: G06K9/46, G06K9/00
CPC classification: G06K9/4638, G06K9/38, G06K2209/01, G06K2209/015
Abstract: A method for enhancing a textual image for undergoing optical character recognition begins by receiving an image that includes native lines of text. A background line profile is determined which represents an average background intensity along the native lines in the image. Likewise, a foreground line profile is determined which represents an average foreground intensity along the native lines in the image. The pixels in the image are assigned to either a background or foreground portion of the image based at least in part on the background line profile and the foreground line profile. The intensity of the pixels designated to the background portion of the image is adjusted to a maximum brightness so as to represent a portion of the image that does not include text.

10. Invention Patent Application

US20110305393A1 TECHNIQUES IN OPTICAL CHARACTER RECOGNITION (Status: Active)
Publication No.: US20110305393A1
Publication date: 2011-12-15
Application No.: US12797219
Filing date: 2010-06-09
Applicant: Djordje Nijemcevic, Sasa Galic
Inventor: Djordje Nijemcevic, Sasa Galic
IPC classification: G06K9/18
CPC classification: G06K9/3283, G06K2209/01
Abstract: An image deskew system and techniques are used in the context of optical character recognition. An image is obtained of an original set of characters in an original linear (horizontal) orientation. An acquired set of characters, which is skewed relative to the original linear orientation by a rotation angle, is represented by pixels of the image. The rotation angle is estimated, and a confidence value may be associated with the estimation, to determine whether to deskew the image. In connection with rotation angle estimation, an edge detection filter is applied to the acquired set of characters to produce an edge map, which is input to a linear Hough transform filter to produce a set of output lines in parametric form. The output lines are assigned scores, and based on the scores, at least one output line is determined to be a dominant line with a slope approximating the rotation angle.
