    • 1. Granted patent
    • Method and system for calculating phrase-document importance
    • US06549897B1
    • 2003-04-15
    • US09215513
    • 1998-12-17
    • Sanjeev Katariya; William P. Jones
    • G06F17/30
    • G06F17/30616; Y10S707/99933; Y10S707/99935; Y10S707/99936
    • A method and system for generating a weight for phrases within each document in a collection of documents. Each document has terms such as words and numbers. Each phrase comprises component terms. Each term frequency represents the number of occurrences of a term in a document, and the phrase frequency represents the number of occurrences of a phrase in a document. To generate the weight, the weighting system first estimates a document frequency for the phrase by multiplying an estimated phrase probability of the phrase times the number of documents that contain each component term. The estimated phrase probability is an estimation of the probability that any phrase in documents that contain each component term is the phrase whose weight is to be estimated. The document frequency is the number of the documents that contain the phrase. The weighting system then estimates a total phrase frequency for the phrase as the average phrase frequency for the phrase times the estimated document frequency for the phrase. The weighting system derives the average phrase frequency from the phrase probability of the phrase and average number of terms per document. The weighting system then combines the estimated document frequency with the estimated total phrase frequency to generate the weight of the phrase.
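The estimation steps in this abstract can be sketched in Python. This is a minimal illustration, not the patented method: the abstract does not say how the phrase probability is obtained, how the average phrase frequency is derived, or how the two estimates are combined. Here `phrase_prob` is assumed to be supplied by the caller, the average phrase frequency is assumed to be `max(1, phrase_prob * average terms per document)`, and the final combination is assumed to be a TF-IDF-style product. All function names are hypothetical.

```python
import math

def estimate_phrase_stats(docs, phrase_terms, phrase_prob):
    """Estimate document frequency and total phrase frequency for a phrase.

    docs: list of token lists; phrase_prob: estimated phrase probability
    (assumed to be given; the abstract does not say how it is computed).
    """
    # Number of documents that contain every component term of the phrase.
    n_with_all_terms = sum(1 for d in docs if all(t in d for t in phrase_terms))
    # Estimated document frequency = phrase probability x that document count.
    est_doc_freq = phrase_prob * n_with_all_terms
    # Assumption: average phrase frequency from the phrase probability and the
    # average number of terms per document, floored at one occurrence.
    avg_terms_per_doc = sum(len(d) for d in docs) / len(docs)
    avg_phrase_freq = max(1.0, phrase_prob * avg_terms_per_doc)
    # Estimated total phrase frequency = average phrase frequency x estimated df.
    est_total_phrase_freq = avg_phrase_freq * est_doc_freq
    return est_doc_freq, est_total_phrase_freq

def phrase_weight(phrase_freq, est_doc_freq, est_total_phrase_freq, n_docs):
    """Assumed TF-IDF-style combination of the two estimates."""
    if est_doc_freq <= 0:
        return 0.0
    normalized_pf = phrase_freq / est_total_phrase_freq
    return normalized_pf * math.log(1 + n_docs / est_doc_freq)
```

The point of estimating, as the abstract describes it, is that the weight is derived from per-term statistics and a phrase probability rather than from an actual count of the documents containing the phrase.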
    • 2. Granted patent
    • Method and system for calculating term-document importance
    • US06473753B1
    • 2002-10-29
    • US09216085
    • 1998-12-18
    • Sanjeev Katariya; William P. Jones
    • G06F17/30
    • G06F17/277; G06F17/2785; Y10S707/99934; Y10S707/99935
    • A weighting system for calculating the term-document importance for each term within each document that is part of a collection of documents (i.e., a corpus). The weighting system calculates the importance of a term within a document based on a computed normalized term frequency and a computed inverse document frequency. The computed normalized term frequency is a function, referred to as the “computed term frequency function” (“A”), of a normalized term frequency. The normalized term frequency is the term frequency, which is the number of times that the term occurs in the document, normalized by the total term frequency of the term within all documents, which is the total number of times that the term occurs in all the documents. The weighting system normalizes the term frequency by dividing the term frequency by a function, referred to as the “normalizing term frequency function” (“Γ”), of the total term frequency. The computed inverse document frequency is a function, referred to as the “computed inverse document frequency function” (“B”) of the inverse document frequency. The weighting system identifies a computed normalized term frequency function A and a computed inverse document frequency function B so that on average the computed normalized term frequency and the computed inverse document frequency contribute equally to the weight of the terms.
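The structure of this weighting can be sketched as follows. The abstract names the functions A, B and Γ but does not define them, and it does not state how the two computed values are combined, so the defaults below (a logarithmic Γ, identity A, logarithmic B, and a product) are placeholders chosen for illustration only. The function name `term_weights` is hypothetical.

```python
import math
from collections import Counter

def term_weights(docs,
                 gamma=lambda total_tf: 1 + math.log(total_tf),  # placeholder for Γ
                 A=lambda x: x,                                   # placeholder for A
                 B=lambda x: math.log(x)):                        # placeholder for B
    """Return one {term: weight} dict per document (docs are token lists)."""
    n_docs = len(docs)
    # Total term frequency: occurrences of each term across all documents.
    total_tf = Counter(t for d in docs for t in d)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(t for d in docs for t in set(d))
    weights = []
    for d in docs:
        tf = Counter(d)
        weights.append({
            # Normalized tf = tf / Γ(total tf); inverse df = n_docs / df.
            # Assumption: the two computed values are multiplied.
            t: A(count / gamma(total_tf[t])) * B(n_docs / doc_freq[t])
            for t, count in tf.items()
        })
    return weights
```

Passing the three functions as parameters mirrors the abstract's framing: the system chooses A and B so that, on average, both factors contribute equally to the weight, which this sketch does not attempt to calibrate.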
    • 3. Granted patent
    • Creating a summary having sentences with the highest weight, and lowest length
    • US06789230B2
    • 2004-09-07
    • US09216097
    • 1998-12-18
    • Sanjeev Katariya; William P. Jones
    • G06F7/00
    • G06F17/27
    • A method and system for generating a summary of a document. The summary generating system generates the summary from the sentences that form the document. The summary generating system calculates a weight for each of the sentences in the document. The weight indicates the importance of the sentence to the document. The summary generating system then selects sentences based on their calculated weights. The summary generating system creates a summary of the selected sentences such that selected sentences are ordered in the created summary in the same relative order as in the document. In one embodiment, the summary generating system identifies sets of sentences whose total length of the sentences in the set is less than a maximum length. The summary generating system then selects an identified set of sentences whose total of the calculated weights of the sentences is greatest as the generated summary. The length of a sentence may be measured in characters or words. In an alternate embodiment, the summary generating system selects the sentences with the highest calculated weights whose total length of the selected sentences is less than a maximum length as the summary.
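The two embodiments in this abstract can be contrasted with a short Python sketch. It assumes sentence weights are already computed and passed in, measures length in words, and reads the first embodiment as a 0/1 knapsack problem and the alternate embodiment as a greedy pass that skips sentences that no longer fit. Both readings and both function names are assumptions made for illustration.

```python
def summarize_optimal(sentences, weights, max_len):
    """Pick the set of sentences with the greatest total weight whose total
    length (in words) is strictly less than max_len (0/1 knapsack)."""
    if max_len <= 0:
        return []
    lengths = [len(s.split()) for s in sentences]
    cap = max_len - 1  # strict "less than" becomes "at most cap"
    # best[c] = (total weight, chosen sentence indices) using at most c words
    best = [(0.0, ())] * (cap + 1)
    for i, (w, n) in enumerate(zip(weights, lengths)):
        for c in range(cap, n - 1, -1):
            candidate = (best[c - n][0] + w, best[c - n][1] + (i,))
            if candidate[0] > best[c][0]:
                best[c] = candidate
    # Indices were appended in increasing order, so document order is kept.
    return [sentences[i] for i in best[cap][1]]

def summarize_greedy(sentences, weights, max_len):
    """Take sentences in descending weight order while the total length
    stays below max_len, then restore document order."""
    order = sorted(range(len(sentences)), key=lambda i: weights[i], reverse=True)
    chosen, used = [], 0
    for i in order:
        n = len(sentences[i].split())
        if used + n < max_len:
            chosen.append(i)
            used += n
    return [sentences[i] for i in sorted(chosen)]
```

The two can disagree: a single long, heavy sentence may crowd out two shorter sentences whose combined weight is higher, which the knapsack version detects and the greedy version does not.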
    • 5. Patent application
    • Componentized slot-filling architecture
    • US20070094185A1
    • 2007-04-26
    • US11246847
    • 2005-10-07
    • William Ramsey; Jianfeng Gao; Sanjeev Katariya
    • G06N5/00
    • G06F17/2785
    • The subject disclosure pertains to systems and methods for performing natural language processing in which tokens are mapped to task slots. The system includes a mapper component that generates a lattice representing possible interpretations of the tokens, a decoder component that creates a ranked list of paths traversing the lattice, a scorer component that generates scores used to rank paths and post-processing components that format the paths for use by other software. Each of these components may be independent, such that the component may be modified or replaced without affecting the remaining components. This allows a variety of different mathematical models and algorithms to be tested or deployed without requiring changes to the remainder of the system.
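A toy Python sketch of the componentized pipeline described in this abstract follows. The mapper, scorer, decoder and post-processor are plain functions passed into `fill_slots`, so any one of them can be replaced without touching the others. The lexicon format, the per-token additive scoring, and the exhaustive path enumeration are all assumptions made for illustration; the abstract does not specify any particular model or algorithm.

```python
from itertools import product

def mapper(tokens, lexicon):
    """Build a lattice: candidate slots per token (None = token left unmapped)."""
    return [lexicon.get(tok, []) + [None] for tok in tokens]

def scorer(tokens, path, slot_scores):
    """Score a path as the sum of per-(token, slot) scores (assumed model)."""
    return sum(slot_scores.get((tok, slot), 0.0) for tok, slot in zip(tokens, path))

def decoder(tokens, lattice, score_fn, top_k=3):
    """Rank paths through the lattice; exhaustive search, for small inputs only."""
    ranked = sorted(product(*lattice), key=lambda p: score_fn(tokens, p), reverse=True)
    return ranked[:top_k]

def postprocess(tokens, path):
    """Format a path as a {slot: token} dict for downstream software."""
    return {slot: tok for tok, slot in zip(tokens, path) if slot is not None}

def fill_slots(tokens, lexicon, slot_scores,
               map_fn=mapper, score_fn=scorer, decode_fn=decoder, post_fn=postprocess):
    """Wire the components together; each one is independently swappable."""
    lattice = map_fn(tokens, lexicon)
    paths = decode_fn(tokens, lattice, lambda t, p: score_fn(t, p, slot_scores))
    return [post_fn(tokens, p) for p in paths]
```

Swapping in a different scoring model only requires passing another `score_fn` with the same signature, which is the property the abstract emphasizes.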
    • 7. Patent application
    • Adaptive customer assistance system for software products
    • US20060265232A1
    • 2006-11-23
    • US11133549
    • 2005-05-20
    • Sanjeev Katariya; Hsiao-Wuen Hon
    • G06Q99/00
    • G06Q30/016; G06F9/453; G06Q30/02; G06Q30/0281
    • An adaptive customer assistance system that can serve as an integrated online and offline help platform for a suite of software products is provided. The assistance system includes a customer-interaction interface and a data management component and a download management component for distributed customer interaction. The data management component includes an authoring component, a download component, a runtime component and an analysis component. The runtime component, which includes a customer assistance model, is configured to receive a user-formulated question from the customer-interaction interface. The runtime component provides an answer to the user-formulated question based on information included in the customer assistance model. The analysis component automatically analyzes, in substantially real-time, the user-formulated question and the corresponding answer, and provides an analysis output for use in improving a quality of customer assistance.
    • 9. Granted patent
    • Data cache using plural lists to indicate sequence of data storage
    • US06449695B1
    • 2002-09-10
    • US09321301
    • 1999-05-27
    • Alexandre Bereznyi; Sanjeev Katariya
    • G06F12/12
    • G06F17/30902; G06F12/123
    • A cache system controls the insertion and deletion of data items using a plurality of utilization lists. When a data item is stored within the data cache, a corresponding data pointer, or other indicator, is stored within the utilization list in a manner indicative of the sequence in which data items were stored in the data cache. When a data item is subsequently retrieved from the data cache, the corresponding data pointer may be altered or moved to indicate that the data item has recently been retrieved. The data pointers corresponding to data items that have never been retrieved will indicate the sequence with which the data items were stored in the cache such that data items may be identified as least recently used (LRU) data items. The data pointers corresponding to data items that have been retrieved provide an indication of the sequence with which the data items have been retrieved such that the most recently retrieved data item is considered the most recently used (MRU) data item. The system controls the deletion of data items from the cache by deleting the LRU data items. A large number of utilization lists may operate independently to accommodate a large number of users. An entry pointer selects one of the utilization lists to store the data pointer corresponding to a data item stored within the cache. A deletion pointer selects one of the utilization lists. The system deletes the LRU data item based on the utilization list currently selected by the deletion pointer.
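The mechanism in this abstract can be sketched in Python as a cache with several independent utilization lists. In this sketch, which is an interpretation rather than the patented design, each list is an `OrderedDict` ordered from least to most recently used, the entry pointer and deletion pointer both advance round-robin, and a capacity of at least one is assumed. The class name `MultiListCache` is hypothetical.

```python
from collections import OrderedDict

class MultiListCache:
    def __init__(self, capacity, num_lists=2):
        self.capacity = capacity
        self.data = {}                                   # key -> cached value
        self.lists = [OrderedDict() for _ in range(num_lists)]
        self.owner = {}                                  # key -> index of its list
        self.entry_ptr = 0                               # list used by the next insert
        self.delete_ptr = 0                              # list used by the next eviction

    def put(self, key, value):
        if key in self.data:
            self.data[key] = value
            self.lists[self.owner[key]].move_to_end(key)
            return
        if len(self.data) >= self.capacity:
            self._evict()
        idx = self.entry_ptr
        self.entry_ptr = (idx + 1) % len(self.lists)     # assumed round-robin
        self.lists[idx][key] = None                      # new items join at the MRU end
        self.owner[key] = idx
        self.data[key] = value

    def get(self, key):
        if key not in self.data:
            return None
        # A retrieval moves the pointer to the MRU end of its own list.
        self.lists[self.owner[key]].move_to_end(key)
        return self.data[key]

    def _evict(self):
        # Delete the LRU item of the list selected by the deletion pointer,
        # skipping lists that happen to be empty.
        for _ in range(len(self.lists)):
            lst = self.lists[self.delete_ptr]
            self.delete_ptr = (self.delete_ptr + 1) % len(self.lists)
            if lst:
                victim, _unused = lst.popitem(last=False)
                del self.data[victim]
                del self.owner[victim]
                return
```

Because each list tracks recency only for its own items, the evicted item is the LRU of one list, not necessarily the globally least recently used item. That is the trade-off for letting the lists operate independently.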
    • 10. Granted patent
    • Automatically matching data sets with storage components
    • US08949293B2
    • 2015-02-03
    • US12972137
    • 2010-12-17
    • Magdi A. Morsi; Wai Ho Au; Ying Sun; Sanjeev Katariya; Yang Xu; Nina Sarawgi
    • G06F17/30; G06F11/34
    • G06F17/30289; G06F11/3442; G06F11/3485
    • An administrator of an enterprise storage set may be tasked with storing a large number and variety of data sets on a large number and variety of storage components. However, the manual selection of a physical schema by an administrator may be time-consuming, may generate inefficient physical schemata, and may not be easily reevaluated as the data sets and storage set change. Presented herein are techniques for automatically determining a physical schema by comparing the storage factors of each data set (e.g., data size, relationships with other data sets, and usages of the data set by users) with the storage capabilities of the storage components, selecting a suitable storage component, and implementing the storage of the data set on the storage component. An embodiment of these techniques may thereby achieve an automated identification of a physical schema with improved efficiency and flexibility of the physical schema while conserving administrative resources.
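The matching step in this abstract can be illustrated with a small Python sketch. It reduces "storage factors" to a size and a set of required features, and "storage capabilities" to free capacity and a set of offered features, then assigns each data set to the tightest-fitting component. The field names, the largest-first ordering, and the best-fit rule are all hypothetical simplifications; the abstract also mentions relationships between data sets and usage patterns, which this sketch ignores.

```python
from dataclasses import dataclass

@dataclass
class DataSet:
    name: str
    size_gb: float
    needs: frozenset = frozenset()         # required features, e.g. {"low_latency"}

@dataclass
class StorageComponent:
    name: str
    free_gb: float
    capabilities: frozenset = frozenset()  # offered features

def assign(data_sets, components):
    """Return {data set name: component name or None} (a toy physical schema)."""
    schema = {}
    # Place the largest data sets first so they are not starved of space.
    for ds in sorted(data_sets, key=lambda d: d.size_gb, reverse=True):
        fits = [c for c in components
                if c.free_gb >= ds.size_gb and ds.needs <= c.capabilities]
        if not fits:
            schema[ds.name] = None         # nothing suitable; left unassigned
            continue
        # Prefer the component with the fewest unused capabilities, then the
        # least free space, to keep specialized storage available.
        best = min(fits, key=lambda c: (len(c.capabilities - ds.needs), c.free_gb))
        best.free_gb -= ds.size_gb         # note: mutates the component's free space
        schema[ds.name] = best.name
    return schema
```

Because the assignment is a pure function of the current data sets and components, it can be rerun whenever either changes, which is the reevaluation problem the abstract attributes to manual schema selection.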