    • 3. Invention application
    • RE-DIGITIZATION AND ERROR CORRECTION OF ELECTRONIC DOCUMENTS
    • WO2013165334A1
    • 2013-11-07
    • PCT/US2012/035718
    • 2012-04-29
    • HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.; SIMSKE, Steven J.; LIU, Samson J.
    • SIMSKE, Steven J.; LIU, Samson J.
    • G06K9/03; G06K7/10; G06K9/20
    • G06K9/18; G06K9/00442; G06K9/03
    • A system and method for error-correcting existing electronic documents are disclosed. An electronic document may be rasterized to obtain a pixel representation of the electronic document (e.g., a raster image). One or more optical character recognition (OCR) tasks may be performed on the raster image of the electronic document. Errors discovered by the OCR tasks may be corrected, and a customized error-corrected version of the electronic document may be created and stored. If the author of the electronic document is known, the raster image may be compared to a personalized tf*idf error dictionary associated with the author to determine known OCR errors specific to the author. The raster image may also be compared to a personalized electronic error dictionary associated with the author to determine known typographical errors specific to the author.
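The dictionary-lookup step described in the abstract can be illustrated with a minimal Python sketch. It assumes rasterization and OCR have already produced plain text, and it models each per-author error dictionary as a simple mapping from an erroneous token to its correction. The abstract does not specify the dictionary structure or how tf*idf weights are used, so the names `OCR_ERRORS`, `TYPO_ERRORS`, and `correct_text`, along with all entries, are hypothetical.

```python
import re

# Hypothetical per-author dictionaries; the entries are illustrative only
# and are not taken from the patent.
OCR_ERRORS = {"rn0del": "model", "c1ass": "class"}   # assumed known OCR misreads
TYPO_ERRORS = {"teh": "the", "recieve": "receive"}   # assumed known author typos


def correct_text(ocr_text, *dictionaries):
    """Replace whole alphanumeric tokens found in any supplied error dictionary.

    Lookup is case-insensitive; a replaced token takes the dictionary's casing.
    """
    merged = {}
    for d in dictionaries:
        merged.update(d)

    def fix(match):
        word = match.group(0)
        # Tokens with no dictionary entry are returned unchanged.
        return merged.get(word.lower(), word)

    return re.sub(r"[A-Za-z0-9]+", fix, ocr_text)


corrected = correct_text("teh rn0del will recieve input", OCR_ERRORS, TYPO_ERRORS)
```

Keeping the OCR and typographical dictionaries separate mirrors the two comparisons named in the abstract; a fuller version might rank candidate corrections by a tf*idf score instead of using a direct lookup.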