    • 1. Invention application
    • DATA PROFILE COMPUTATION
    • US20090006392A1
    • 2009-01-01
    • US11769050
    • 2007-06-27
    • Zhimin Chen; Venkatesh Ganti; Gunjan Jha; Shriraghav Kaushik; Vivek Narasayya
    • G06F7/06; G06F17/30
    • G06F17/30536
    • Architecture that provides a data profile computation technique which employs key profile computation and data pattern profile computation. Key profile computation in a data table includes both exact keys as well as approximate keys, and is based on key strengths. A key strength of 100% is an exact key, and any other percentage is an approximate key. The key strength is estimated based on the number of table rows that have duplicated attribute values. Only column sets that exceed a threshold value are returned. Pattern profiling identifies a small set of regular expression patterns which best describe the patterns within a given set of attribute values. Pattern profiling includes three phases: a first phase for determining token regular expressions, a second phase for determining candidate regular expressions, and a third phase for identifying the best regular expressions of the candidates that match the attribute values.
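The key-strength idea in the abstract above can be illustrated with a minimal sketch. It assumes that key strength is the share of rows whose values on a column set are not duplicated in any other row; the abstract only says the strength is based on the number of rows with duplicated values, so this exact formula, the function names `key_strength` and `approximate_keys`, and the `max_size` limit are all hypothetical. The sketch also computes the strength exactly, whereas the abstract speaks of estimating it.

```python
from collections import Counter
from itertools import combinations


def key_strength(rows, columns):
    """Share of rows whose values on `columns` are not duplicated in another row.

    Assumed definition: 1.0 means an exact key, anything lower an approximate key.
    """
    if not rows:
        return 1.0
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    duplicated_rows = sum(n for n in counts.values() if n > 1)
    return 1.0 - duplicated_rows / len(rows)


def approximate_keys(rows, threshold, max_size=2):
    """Return {column set: strength} for column sets whose strength exceeds `threshold`."""
    names = sorted(rows[0]) if rows else []
    found = {}
    for size in range(1, max_size + 1):
        for columns in combinations(names, size):
            strength = key_strength(rows, columns)
            if strength > threshold:
                found[columns] = strength
    return found


# Illustrative table, not taken from the patent.
rows = [
    {"id": 1, "name": "Ann", "city": "Oslo"},
    {"id": 2, "name": "Bob", "city": "Oslo"},
    {"id": 3, "name": "Ann", "city": "Rome"},
    {"id": 4, "name": "Cid", "city": "Rome"},
]
keys = approximate_keys(rows, threshold=0.9)
```

Here `id` alone and the pair (`city`, `name`) both come out as exact keys, while `name` alone has strength 0.5 and is filtered out by the threshold.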
    • 2. Invention application
    • Efficient computation of multiple group by queries
    • US20060253422A1
    • 2006-11-09
    • US11124516
    • 2005-05-06
    • Vivek Narasayya; Zhimin Chen
    • G06F17/30
    • G06F16/24535
    • Systems and methodologies for computation of multiple group by queries via an optimizer that examines the space of plans in a systematic and cost based manner. The optimizer includes a merging component to merge pairs of sub plans to facilitate a plan choice with a lowest cost. The merging component can take as input two sub plans (e.g., sub plan P1 with root node V1 and sub plan P2 with root node V2, wherein each sub plan is a sub-tree of a logical plan whose root node is directly pointed to a Relation “R”), to return a set of sub-plans as output with a root node V1∪V2 that is the smallest relation from which both V1 and V2 can be computed.
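To make the merging step in the abstract above concrete, the following sketch shows why a node grouped on V1∪V2 is enough to answer both group-by queries: the base relation is scanned once, and each query is then rolled up from the smaller intermediate result. It covers only COUNT aggregates and leaves out the cost-based search over plans entirely; the function names `group_count` and `merged_group_counts` are hypothetical.

```python
from collections import Counter


def group_count(rows, columns):
    """Direct GROUP BY columns with COUNT(*), computed from the base relation."""
    return Counter(tuple(row[c] for c in columns) for row in rows)


def merged_group_counts(rows, v1, v2):
    """Answer GROUP BY v1 and GROUP BY v2 from one shared GROUP BY (v1 ∪ v2)."""
    union = sorted(set(v1) | set(v2))
    base = group_count(rows, union)  # the only pass over the base relation

    def roll_up(target):
        # Re-aggregate the intermediate result onto a subset of its columns.
        positions = [union.index(c) for c in target]
        out = Counter()
        for key, count in base.items():
            out[tuple(key[i] for i in positions)] += count
        return out

    return roll_up(v1), roll_up(v2)


# Illustrative relation R, not taken from the patent.
sales = [
    {"region": "EU", "product": "pen"},
    {"region": "EU", "product": "ink"},
    {"region": "US", "product": "pen"},
    {"region": "EU", "product": "pen"},
]
by_region, by_product = merged_group_counts(sales, ["region"], ["product"])
```

Whether the merged plan is actually cheaper depends on how much smaller the V1∪V2 result is than R, which is the trade-off a cost-based optimizer would weigh.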
    • 4. Invention grant
    • Key profile computation and data pattern profile computation
    • US07720883B2
    • 2010-05-18
    • US11769050
    • 2007-06-27
    • Zhimin Chen; Venkatesh Ganti; Gunjan Jha; Shriraghav Kaushik; Vivek Narasayya
    • G06F7/00; G06F17/30
    • G06F17/30536
    • Architecture that provides a data profile computation technique which employs key profile computation and data pattern profile computation. Key profile computation in a data table includes both exact keys as well as approximate keys, and is based on key strengths. A key strength of 100% is an exact key, and any other percentage is an approximate key. The key strength is estimated based on the number of table rows that have duplicated attribute values. Only column sets that exceed a threshold value are returned. Pattern profiling identifies a small set of regular expression patterns which best describe the patterns within a given set of attribute values. Pattern profiling includes three phases: a first phase for determining token regular expressions, a second phase for determining candidate regular expressions, and a third phase for identifying the best regular expressions of the candidates that match the attribute values.
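The three pattern-profiling phases named in the abstract above could be sketched as follows. The abstract does not say how tokens, candidates, or "best" are defined, so everything here is an assumption: tokens are runs of ASCII letters, runs of ASCII digits, or single other characters; a candidate is the concatenation of a value's token expressions; and the best patterns are simply the candidates that cover the most values. The names `candidate_regex` and `best_patterns` are hypothetical.

```python
import re
from collections import Counter

# Phase 1 (assumed): token classes and the regular expression standing for each.
TOKEN = re.compile(r"(?P<alpha>[A-Za-z]+)|(?P<digit>[0-9]+)|(?P<other>.)", re.DOTALL)
TOKEN_REGEX = {"alpha": "[A-Za-z]+", "digit": "[0-9]+"}


def candidate_regex(value):
    """Phase 2 (assumed): join the token expressions of one value into a candidate."""
    parts = []
    for match in TOKEN.finditer(value):
        # Letters and digits generalize to a class; anything else stays literal.
        parts.append(TOKEN_REGEX.get(match.lastgroup) or re.escape(match.group()))
    return "".join(parts)


def best_patterns(values, k=2):
    """Phase 3 (assumed): keep the k candidates that cover the most values."""
    counts = Counter(candidate_regex(v) for v in values)
    return [(pattern, n / len(values)) for pattern, n in counts.most_common(k)]


# Illustrative attribute values, not taken from the patent.
phones = ["425-555-0100", "206-555-0199", "555-0123", "n/a"]
top_pattern, coverage = best_patterns(phones, k=1)[0]
```

With these values the top pattern describes the three-part numbers and covers half of the column, which is the kind of small descriptive summary the abstract describes.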
    • 5. Invention grant
    • High precision set expansion for large concepts
    • US09547718B2
    • 2017-01-17
    • US13325072
    • 2011-12-14
    • Jiewen Huang; Zhimin Chen; Arvind Arasu; Vivek Narasayya
    • G06F17/30
    • G06F17/30867; G06Q30/0201
    • A set expansion system is described herein that improves precision, recall, and performance of prior set expansion methods for large sets of data. The system maintains high precision and recall by 1) identifying the quality of particular lists and applying that quality through a weight, 2) allowing for the specification of negative examples in a set of seeds to reduce the introduction of bad entities into the set, and 3) applying a cutoff to eliminate lists that include a low number of positive matches. The system may perform multiple passes to first generate a good candidate result set and then refine the set to find a set with highest quality. The system may also apply MapReduce or other distributed processing techniques to allow calculation in parallel. Thus, the system efficiently expands large concept sets from a potentially small set of initial seeds from readily available web data.
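The three mechanisms listed in the abstract above (a per-list quality weight, negative seeds, and a cutoff on positive matches) can be combined in a single-pass sketch. The weight formula, the `min_matches` default, and the function name `expand_set` are assumptions made for illustration; the abstract does not specify them, and the multi-pass refinement and distributed execution it mentions are not shown.

```python
def expand_set(lists, positive, negative, min_matches=2):
    """Rank entities that co-occur with the positive seeds in good-quality lists."""
    scores = {}
    for raw in lists:
        items = set(raw)
        matches = len(items & positive)
        if matches < min_matches:
            continue  # cutoff: too few positive matches to trust this list
        penalties = len(items & negative)
        # Assumed quality weight: seed hits minus negative hits, per list item.
        weight = (matches - penalties) / len(items)
        if weight <= 0:
            continue
        for entity in items - positive - negative:
            scores[entity] = scores.get(entity, 0.0) + weight
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda e: (-scores[e], e))


# Illustrative web lists and seeds, not taken from the patent.
web_lists = [
    ["red", "blue", "green", "yellow"],
    ["red", "blue", "green"],
    ["red", "apple", "banana"],          # only one positive match: cut off
    ["red", "blue", "apple", "cherry"],  # a negative seed lowers its weight
]
expanded = expand_set(web_lists, positive={"red", "blue"}, negative={"apple"})
```

In this example `banana` never enters the result because its only list fails the cutoff, and `cherry` ranks last because the list it appears in also contains a negative seed.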
    • 6. Invention grant
    • Dictionary for hierarchical attributes from catalog items
    • US08606788B2
    • 2013-12-10
    • US13160532
    • 2011-06-15
    • Zhimin Chen; Eduardo Laureano; Renfei Luo; Tsheko Mutungu; Vivek Narasayya; David Talby
    • G06F17/30
    • G06F17/30616
    • A plurality of items included in a catalog may be obtained, each item associated with an item category. Brand indicators may be obtained, each brand indicator associated with the item category. Brand indicators associated with each of the items may be determined, and each item may be assigned to a partition group associated with the brand indicator that is associated with that item. Correlated string tokens, associated with each of the plurality of items and correlated above a predetermined correlation threshold value with the brand indicator of the partition group to which the item is assigned, may be determined. A dictionary hierarchy may be generated based on the one or more correlated string tokens.
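As a rough illustration of the abstract above, the sketch below partitions catalog titles by brand, finds tokens that are strongly correlated with each brand, and stores them under the brand as a two-level dictionary. The correlation measure (the share of a token's occurrences that fall inside the brand's partition), the single-word brand matching, and the name `brand_dictionary` are all assumptions; the abstract does not define how correlation is computed.

```python
from collections import Counter, defaultdict


def brand_dictionary(titles, brands, threshold=0.8):
    """Map each brand to the tokens correlated with it above `threshold`."""
    # Assign each title to the partition group of the first brand it mentions.
    partitions = defaultdict(list)
    for title in titles:
        tokens = set(title.lower().split())
        for brand in brands:
            if brand.lower() in tokens:
                partitions[brand].append(tokens)
                break

    # Count in how many partitioned titles each token appears overall.
    total = Counter()
    for group in partitions.values():
        for tokens in group:
            total.update(tokens)

    hierarchy = {}
    for brand, group in partitions.items():
        inside = Counter()
        for tokens in group:
            inside.update(tokens)
        # Assumed correlation: occurrences inside the partition / all occurrences.
        hierarchy[brand] = sorted(
            token
            for token, n in inside.items()
            if token != brand.lower() and n / total[token] > threshold
        )
    return hierarchy


# Illustrative catalog titles for one item category, not taken from the patent.
catalog = [
    "Canon EOS camera",
    "Canon EOS lens",
    "Nikon Coolpix camera",
    "Nikon Coolpix case",
]
dictionary = brand_dictionary(catalog, ["Canon", "Nikon"])
```

The generic token `camera` is shared by both brands and drops out, while product-line tokens such as `eos` and `coolpix` end up under their brand. Multi-word brand names would need a different matching step than the token lookup used here.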
    • 7. Invention application
    • HIGH PRECISION SET EXPANSION FOR LARGE CONCEPTS
    • US20130159317A1
    • 2013-06-20
    • US13325072
    • 2011-12-14
    • Jiewen Huang; Zhimin Chen; Arvind Arasu; Vivek Narasayya
    • G06F17/30
    • G06F17/30867; G06Q30/0201
    • A set expansion system is described herein that improves precision, recall, and performance of prior set expansion methods for large sets of data. The system maintains high precision and recall by 1) identifying the quality of particular lists and applying that quality through a weight, 2) allowing for the specification of negative examples in a set of seeds to reduce the introduction of bad entities into the set, and 3) applying a cutoff to eliminate lists that include a low number of positive matches. The system may perform multiple passes to first generate a good candidate result set and then refine the set to find a set with highest quality. The system may also apply MapReduce or other distributed processing techniques to allow calculation in parallel. Thus, the system efficiently expands large concept sets from a potentially small set of initial seeds from readily available web data.