    • 4. Granted invention patent
    • Method for organizing large numbers of documents
    • US08938461B2
    • 2015-01-20
    • US12839976
    • 2010-07-20
    • Yiftach Ravid; Amir Milo
    • Yiftach Ravid; Amir Milo
    • G06F17/30; G06F17/22; H04L12/58
    • G06K9/00483; G06F17/2211; G06F17/30011; G06F17/30554; G06F17/30598; G06F17/30705; G06K9/00469; H04L51/16
    • A computer product including a data structure for organizing of a plurality of documents, and capable of being utilized by a processor for manipulating data of the data structure and capable of displaying selected data on a display unit. The data structure includes a plurality of directionally interlinked nodes, each node being associated with one or more documents having a header and body text. All the documents are associated with a given node and have identical normalized body text. All documents that have identical normalized body text are associated with the same node. One or more of the nodes is associated with more than one document. For any node that is a descendent of another node, the normalized body text of each document associated with the node is inclusive of the normalized body text of a document that is associated with the other node.
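The data structure described in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `normalize` rule (lowercasing and collapsing whitespace), the class names, and the reading of "inclusive of" as substring containment are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    header: str
    body: str


@dataclass
class Node:
    normalized_body: str
    documents: list = field(default_factory=list)
    parent: "Node | None" = None
    children: list = field(default_factory=list)


def normalize(body: str) -> str:
    # Hypothetical normalization: collapse whitespace and lowercase.
    return re.sub(r"\s+", " ", body).strip().lower()


def build_nodes(documents):
    # One node per distinct normalized body text, so documents with
    # identical normalized text always share a node.
    by_text = {}
    for doc in documents:
        key = normalize(doc.body)
        node = by_text.setdefault(key, Node(key))
        node.documents.append(doc)

    # Link each node to the longest shorter text that it contains
    # (assumed meaning of "inclusive of" in the abstract).
    nodes = sorted(by_text.values(), key=lambda n: len(n.normalized_body))
    for i, node in enumerate(nodes):
        for candidate in reversed(nodes[:i]):
            if candidate.normalized_body in node.normalized_body:
                node.parent = candidate
                candidate.children.append(node)
                break
    return nodes
```

For example, two messages whose bodies differ only in case or spacing end up in one node, and a reply that quotes that text becomes a descendant of it.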
    • 5. Invention patent application
    • DETERMINING NEAR DUPLICATE "NOISY" DATA OBJECTS
    • US20100150453A1
    • 2010-06-17
    • US12161775
    • 2007-01-25
    • Yiftach Ravid; Amir Milo
    • Yiftach Ravid; Amir Milo
    • G06K9/68; G06K9/40
    • G06F17/2211; G06K9/03
    • A system configured to find near duplicate documents. For each two (or more) documents that are similar to each other, the system is configured to identify which of the differences is likely to be generated by an Optical Character Recognition software or otherwise due to difference between the original documents. As a result, the process of identifying similarity between documents is improved by identifying documents that were originally exact duplicates but are different one with respect to the other only due to OCR errors, or correct the similarity level between the documents by correcting errors introduced by the OCR tool.
    • 6. Granted invention patent
    • Method for organizing large numbers of documents
    • US08825673B2
    • 2014-09-02
    • US12667664
    • 2008-07-02
    • Yiftach Ravid; Amir Milo
    • Yiftach Ravid; Amir Milo
    • G06F17/30; G06F17/22; H04L12/58
    • G06K9/00483; G06F17/2211; G06F17/30011; G06F17/30554; G06F17/30598; G06F17/30705; G06K9/00469; H04L51/16
    • A computer product including a data structure for organizing of a plurality of documents, and capable of being utilized by a processor for manipulating data of the data structure and capable of displaying selected data on a display unit. The data structure includes a plurality of directionally interlinked nodes, each node being associated with one or more documents having a header and body text. All the documents are associated with a given node and have identical normalized body text. All documents that have identical normalized body text are associated with the same node. One or more of the nodes is associated with more than one document. For any node that is a descendent of another node, the normalized body text of each document associated with the node is inclusive of the normalized body text of a document that is associated with the other node.
    • 7. Granted invention patent
    • System for enhancing expert-based computerized analysis of a set of digital documents and methods useful in conjunction therewith
    • US08533194B1
    • 2013-09-10
    • US13161087
    • 2011-06-15
    • Yiftach Ravid; Tal Sterenzy
    • Yiftach Ravid; Tal Sterenzy
    • G06F7/00
    • G06N99/005
    • An electronic document analysis method using a processor for analyzing N electronic documents, the method comprising providing a set of control electronic documents from among the electronic N documents; and using the set of control electronic documents and a processor to evaluate at least one aspect of a computerized text-classifier based electronic document categorization process performed on the N documents including computation of at least one statistic; wherein providing includes providing an initial set of control electronic documents; computing, using a processor, an estimated validation level of the at least one statistic assuming the initial set is used, and comparing the estimated validation level to a desired validation level, using a processor, and enlarging the initial set of control electronic documents if the estimated validation level falls below the desired validation level.
    • 8. Granted invention patent
    • Method for determining near duplicate data objects
    • US08015124B2
    • 2011-09-06
    • US11572441
    • 2005-07-07
    • Amir Milo; Yiftach Ravid
    • Amir Milo; Yiftach Ravid
    • G06F15/18
    • G06F17/30705; G06F17/2211
    • A system for determining that a document B is a candidate for near duplicate to a document A with a given similarity level th. The system includes a storage for providing two different functions on the documents, each function having a numeric function value. The system further includes a processor associated with the storage and configured to determine that the document B is a candidate for near duplicate to the document A, if a condition is met. The condition includes: for any function ƒi from among the two functions, ƒi(A)−ƒi(B)≦δi(ƒ,A,th).
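The candidate test in the abstract above (every function value for B must stay within a tolerance δi of the value for A) can be sketched in Python as follows. The two functions (`word_count`, `distinct_word_count`) and the tolerance formula in `delta` are hypothetical choices, not taken from the patent. The sketch also compares the absolute difference, which is an assumption; the abstract as printed shows a plain difference.

```python
def word_count(text: str) -> int:
    # Hypothetical function f1: total number of whitespace-separated words.
    return len(text.split())


def distinct_word_count(text: str) -> int:
    # Hypothetical function f2: number of distinct lowercased words.
    return len(set(text.lower().split()))


FUNCTIONS = (word_count, distinct_word_count)


def delta(func, doc_a: str, th: float) -> float:
    # Hypothetical tolerance: the allowed deviation shrinks to zero
    # as the similarity level th approaches 1.
    return (1.0 - th) * func(doc_a)


def is_candidate(doc_a: str, doc_b: str, th: float, functions=FUNCTIONS) -> bool:
    # B is a near-duplicate candidate of A only if every function value
    # for B lies within the tolerance of the value for A.
    return all(
        abs(f(doc_a) - f(doc_b)) <= delta(f, doc_a, th)
        for f in functions
    )
```

Because each function yields a single number per document, a check of this kind can cheaply rule out most pairs before any more expensive full-text comparison is attempted.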
    • 9. Granted invention patent
    • System and method for computerized batching of huge populations of electronic documents
    • US09002842B2
    • 2015-04-07
    • US13569752
    • 2012-08-08
    • Yiftach Ravid
    • Yiftach Ravid
    • G06F17/30
    • G06F17/30598; G06F17/30011; G06F17/3071
    • A method for computerized batching of huge populations of electronic documents, including computerized assignment of electronic documents into at least one sequence of electronic document batches such that each document is assigned to a batch in the sequence of batches and such that there is no conflict between batching requirements, the following batching requirements being maintained by a suitably programmed processor: a. pre-defined subsets of documents are always kept together in the same batch, b. batches are equal in size, c. the population is partitioned into clusters, and all documents in any given batch belong to a single cluster rather than to two or more clusters.
    • 10. Granted invention patent
    • System for enhancing expert-based computerized analysis of a set of digital documents and methods useful in conjunction therewith
    • US08706742B1
    • 2014-04-22
    • US13342770
    • 2012-01-03
    • Yiftach Ravid; Liad Tal-Rothschild
    • Yiftach Ravid; Liad Tal-Rothschild
    • G06F17/30; G06F7/00
    • G06N5/04
    • A system including an electronic repository having a multiplicity of accesses to a respective multiplicity of electronic documents and metadata; a document rater using a processor to run a first computer algorithm on the multiplicity of electronic documents which yields a score which rates each of the multiplicity of electronic documents to an issue; and a metadata-based document discriminator to run a second computer algorithm on at least some of the metadata which yields leads, each lead having at least one metadata value for at least one metadata parameter, whose value correlates with the score of the electronic documents to the issue, typically used in combination with an electronic document analysis method receiving N electronic documents pertaining to a case encompassing a set of issues including at least one issue and establishing relevance of at least the N documents to at least one individual issue in the set of issues.