    • 4. Invention application
    • Integration and Combination of Random Sampling and Document Batching
    • US20130132394A1
    • 2013-05-23
    • US13745617
    • 2013-01-18
    • Jan Puzicha
    • Jan Puzicha
    • G06F17/30
    • G06F16/93; G06Q10/103; G06Q50/18
    • Methods and systems of integrated batching and random sampling of documents for enhanced functionality and quality control, such as validation, within a document review process are provided herein. According to various embodiments, a batching request may be received and may include a population size that corresponds to a total amount of documents available for sampling. The batching request may also include an acceptable margin of error. A random sample size may be calculated based on the batching request, and then a subset of documents corresponding to the random sample size may be selected from the total amount of documents available for sampling. The subset of documents may be grouped into one or more batches, and the one or more batches may be assigned to one or more review nodes.
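The abstract above names the inputs (population size, acceptable margin of error) and the steps (compute a sample size, select a subset, group into batches, assign to review nodes) but gives no formula. The following is a minimal sketch, assuming Cochran's sample-size formula with a finite population correction, a 95% confidence level (z = 1.96), and round-robin assignment. The function names, the `batch_size` parameter, and the node names are hypothetical and do not come from the patent.

```python
import math
import random


def sample_size(population_size, margin_of_error, z=1.96, p=0.5):
    """Sample size via Cochran's formula with finite population correction.

    Assumption: the patent abstract does not say which formula is used.
    z=1.96 corresponds to 95% confidence; p=0.5 is the most conservative
    guess for the unknown proportion.
    """
    if population_size <= 0 or not 0 < margin_of_error < 1:
        raise ValueError("need population_size > 0 and 0 < margin_of_error < 1")
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    n = n0 / (1 + (n0 - 1) / population_size)
    return min(population_size, math.ceil(n))


def batch_and_assign(doc_ids, margin_of_error, batch_size, review_nodes, seed=None):
    """Draw a random sample, split it into batches, assign batches round-robin."""
    rng = random.Random(seed)
    n = sample_size(len(doc_ids), margin_of_error)
    subset = rng.sample(doc_ids, n)  # sampling without replacement
    batches = [subset[i:i + batch_size] for i in range(0, n, batch_size)]
    assignments = {node: [] for node in review_nodes}
    for i, batch in enumerate(batches):
        assignments[review_nodes[i % len(review_nodes)]].append(batch)
    return assignments


print(sample_size(10_000, 0.05))  # 370
```

With 10,000 documents and a 5% margin of error, the sample is 370 documents; `batch_and_assign(list(range(10_000)), 0.05, 50, ["node-a", "node-b", "node-c"])` would spread those over eight batches across the three hypothetical nodes.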
    • 5. Invention grant
    • Systems and methods for predictive coding
    • US09595005B1
    • 2017-03-14
    • US13848023
    • 2013-03-20
    • Jan Puzicha; Steve Vranas
    • Jan Puzicha; Steve Vranas
    • G06F9/44; G06N7/02; G06N7/06; G06N5/04
    • G06N99/005; G06F17/30011; G06N5/04; G06N5/048; G06N7/005
    • Systems and methods for analyzing documents are provided herein. A plurality of documents and user input are received via a computing device. The user input includes hard coding of a subset of the plurality of documents, based on an identified subject or category. Instructions stored in memory are executed by a processor to generate an initial control set, analyze the initial control set to determine at least one seed set parameter, automatically code a first portion of the plurality of documents based on the initial control set and the seed set parameter associated with the identified subject or category, analyze the first portion of the plurality of documents by applying an adaptive identification cycle, and retrieve a second portion of the plurality of documents based on a result of the application of the adaptive identification cycle test on the first portion of the plurality of documents.
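The abstract above lists the stages of the workflow (hard-coded documents, a control set, a seed set parameter, automatic coding, and a follow-up retrieval) without defining how any of them is computed. The toy sketch below shows one possible reading only. Everything in it is an assumption made for illustration: the token-count model, the use of control-set prevalence as the "seed set parameter", and the choice of least-certain documents as the second portion. It is not the patented method.

```python
import math
from collections import Counter


def tokens(text):
    return text.lower().split()


def seed_set_size(control_labels, min_relevant=2):
    """Hypothetical seed set parameter: how many documents to hard-code so
    that roughly `min_relevant` relevant examples are expected, given the
    prevalence observed in the control set."""
    prevalence = sum(control_labels.values()) / len(control_labels)
    if prevalence == 0:
        return None  # no relevant documents observed, so no estimate
    return math.ceil(min_relevant / prevalence)


def train(seed_docs, seed_labels):
    """Toy model: token weight = relevant doc count minus non-relevant doc count."""
    weights = Counter()
    for doc_id, text in seed_docs.items():
        sign = 1 if seed_labels[doc_id] else -1
        for tok in set(tokens(text)):
            weights[tok] += sign
    return weights


def score(weights, text):
    return sum(weights[tok] for tok in set(tokens(text)))


def auto_code(weights, docs):
    """Code each document as relevant (True) when its score is positive."""
    return {doc_id: score(weights, text) > 0 for doc_id, text in docs.items()}


def next_portion(weights, docs, k):
    """Pick the k documents the toy model is least sure about (score nearest 0)."""
    return sorted(docs, key=lambda d: (abs(score(weights, docs[d])), d))[:k]


# Hypothetical data: user-coded control and seed documents, then unreviewed ones.
control = {"c1": True, "c2": False, "c3": False, "c4": False}
seed_docs = {"s1": "merger contract draft", "s2": "lunch menu friday"}
seed_labels = {"s1": True, "s2": False}
unreviewed = {"u1": "contract amendment", "u2": "friday lunch", "u3": "parking notice"}

weights = train(seed_docs, seed_labels)
print(seed_set_size(control))             # 8
print(auto_code(weights, unreviewed))     # {'u1': True, 'u2': False, 'u3': False}
print(next_portion(weights, unreviewed, 1))  # ['u3']
```

In this reading, the documents returned by `next_portion` would go back to a human reviewer, and the cycle would repeat with the enlarged seed set.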
    • 6. Invention grant
    • Document relevancy analysis within machine learning systems including determining closest cosine distances of training examples
    • US08533148B1
    • 2013-09-10
    • US13632943
    • 2012-10-01
    • Christian Feuersänger; Dietrich Wettschereck; Jan Puzicha
    • Christian Feuersänger; Dietrich Wettschereck; Jan Puzicha
    • G06F15/00; G06F15/18
    • G06F17/3053; G06F17/30687; G06F17/30705; G06N99/005
    • Systems and methods that quantify document relevance for a document relative to a training corpus and select a best match or best matches are provided herein. Methods may include generating an example-based explanation for relevancy of a document to a training corpus by executing a support vector machine classifier, the support vector machine classifier performing a centroid classification of a relevant document in a term frequency-inverse document frequency features space relative to training examples in a training corpus, and generating an example-based explanation by selecting a best match for the relevant document from the training examples based upon the centroid classification. Determining the training example having the closest cosine distance to the relevant document includes ranking the training examples by stretching the internal best match scores for the training examples linearly to cover a complete unit interval.
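Two of the steps named in the abstract above can be illustrated directly: measuring cosine closeness in a TF-IDF feature space, and stretching the resulting scores linearly so they cover the full unit interval. The sketch below covers only those two steps and omits the support vector machine and centroid classification entirely. The smoothed IDF variant, whitespace tokenization, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Sparse TF-IDF vectors using a smoothed IDF: log((1 + n) / (1 + df)) + 1."""
    tokenized = [t.lower().split() for t in texts]
    n = len(tokenized)
    df = Counter(tok for toks in tokenized for tok in set(toks))
    return [
        {tok: count * (math.log((1 + n) / (1 + df[tok])) + 1)
         for tok, count in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(a, b):
    dot = sum(weight * b.get(tok, 0.0) for tok, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def stretch(scores):
    """Linearly rescale scores so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def best_matches(document, training_examples):
    """Rank training examples by stretched cosine similarity to the document."""
    vectors = tfidf_vectors(training_examples + [document])
    doc_vec = vectors[-1]
    sims = [cosine(doc_vec, v) for v in vectors[:-1]]
    pairs = zip(training_examples, stretch(sims))
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


training = ["patent claim infringement", "lunch menu friday", "patent license agreement"]
ranked = best_matches("patent infringement lawsuit", training)
print(ranked[0][0])  # patent claim infringement
```

The top-ranked training example serves as the example-based explanation: it is the training document closest to the one being explained, and after stretching it always scores 1.0 while the least similar example scores 0.0.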
    • 7. Invention grant
    • System and method for providing information navigation and filtration
    • US08024333B1
    • 2011-09-20
    • US12641118
    • 2009-12-17
    • Jan Puzicha; Thomas Hofmann
    • Jan Puzicha; Thomas Hofmann
    • G06F17/30
    • G06F17/30864; G06F17/30011; G06F17/3053
    • A system and method for information navigation and filtration is provided. One or more query terms are received from a user. A preliminary relevance of one or more objects associated with an enterprise system is determined based on the query terms. The preliminary relevance may be propagated between objects. At least one rating is assigned to the one or more objects based on the preliminary relevance. An overall relevance of the one or more objects is established based on the at least one rating. The one or more objects are ranked according to the overall relevance. Data is provided as search results comprised of the one or more objects according to the ranking to the user. The search results may then be filtered based on at least one selected, dynamically generated filter. The filtered search results may be dynamically generated and provided to the user.
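The pipeline in the abstract above (preliminary relevance from query terms, propagation between objects, ranking, then dynamically generated filters) can be sketched in a few functions. This is an illustrative simplification under stated assumptions: relevance is the fraction of query terms an object's text contains, propagation adds a damped share of a linked object's score, and the separate "rating" step is collapsed into the propagated score. The object schema (`text`, `type`), the `damping` value, and all names are hypothetical.

```python
from collections import Counter


def preliminary_relevance(query_terms, objects):
    """Fraction of query terms that appear in each object's text."""
    terms = {t.lower() for t in query_terms}
    return {
        oid: len(terms & set(obj["text"].lower().split())) / len(terms)
        for oid, obj in objects.items()
    }


def propagate(relevance, links, damping=0.5):
    """Pass a damped share of each object's relevance to the objects it links to."""
    overall = dict(relevance)
    for oid, neighbors in links.items():
        for neighbor in neighbors:
            overall[neighbor] += damping * relevance[oid]
    return overall


def rank(overall):
    """Order object ids by descending relevance, ties broken by id."""
    return sorted(overall, key=lambda oid: (-overall[oid], oid))


def facets(ranked, objects, field):
    """Dynamically generated filter options: values of `field` in the results."""
    return Counter(objects[oid][field] for oid in ranked)


def apply_filter(ranked, objects, field, value):
    return [oid for oid in ranked if objects[oid][field] == value]


# Hypothetical enterprise objects: two documents and the author of one of them.
objects = {
    "doc1": {"text": "quarterly sales report", "type": "document"},
    "doc2": {"text": "sales forecast", "type": "document"},
    "alice": {"text": "alice smith", "type": "person"},
}
links = {"doc1": ["alice"]}  # doc1 was written by alice

overall = propagate(preliminary_relevance(["sales", "report"], objects), links)
ranked = rank(overall)
print(ranked)                                            # ['doc1', 'alice', 'doc2']
print(apply_filter(ranked, objects, "type", "document"))  # ['doc1', 'doc2']
```

Here the person object matches no query term on its own but still appears in the results because relevance propagates from the linked document, which is the behavior the abstract describes.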
    • 9. Invention grant
    • Systems and methods for predictive coding
    • US08489538B1
    • 2013-07-16
    • US13624854
    • 2012-09-21
    • Jan Puzicha; Steve Vranas
    • Jan Puzicha; Steve Vranas
    • G06F9/44; G06N7/02; G06N7/06
    • G06N99/005; G06F17/30011; G06N5/04; G06N5/048; G06N7/005
    • Systems and methods for analyzing documents are provided herein. A plurality of documents and user input are received via a computing device. The user input includes hard coding of a subset of the plurality of documents, based on an identified subject or category. Instructions stored in memory are executed by a processor to generate an initial control set, analyze the initial control set to determine at least one seed set parameter, automatically code a first portion of the plurality of documents based on the initial control set and the seed set parameter associated with the identified subject or category, analyze the first portion of the plurality of documents by applying an adaptive identification cycle, and retrieve a second portion of the plurality of documents based on a result of the application of the adaptive identification cycle test on the first portion of the plurality of documents.