    • 9. Granted invention patent
    • Generation of min-hash signatures
    • US08447032B1
    • 2013-05-21
    • US12042138
    • 2008-03-04
    • Michele Covell; Sergey Ioffe; Shumeet Baluja
    • Michele Covell; Sergey Ioffe; Shumeet Baluja
    • H04L9/00
    • G06F17/30949
    • A computer-implemented method is disclosed for generating a signature representing an input bit vector. A signature generator generates a primary min-hash value based on a primary permutation from a sequence of permutation blocks. If the primary min-hash value is lower than a threshold value, a secondary min-hash value is generated based on a secondary permutation from the same permutation block. The signature generator then determines one or more signature values based on the primary min-hash value, the secondary min-hash value or both. The one or more signature values are stored as elements of the signature.
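The thresholded two-permutation scheme in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a permutation block is modeled as a hypothetical `(primary, secondary)` pair of index permutations, the min-hash is taken as the smallest permuted position among the set bits, and appending both values to the signature when the secondary one is computed is an assumed combination rule, since the abstract only says the signature values depend on "the primary min-hash value, the secondary min-hash value or both".

```python
import random


def min_hash(bits, perm):
    """Return the smallest permuted position among set bits, or None if no bit is set."""
    positions = [perm[i] for i, bit in enumerate(bits) if bit]
    return min(positions) if positions else None


def make_blocks(n_bits, n_blocks, seed=0):
    """Build hypothetical permutation blocks: one (primary, secondary) pair per block."""
    rng = random.Random(seed)
    blocks = []
    for _ in range(n_blocks):
        primary = list(range(n_bits))
        secondary = list(range(n_bits))
        rng.shuffle(primary)
        rng.shuffle(secondary)
        blocks.append((primary, secondary))
    return blocks


def signature(bits, blocks, threshold):
    """Collect signature values block by block.

    Assumed rule: always store the primary min-hash; if it is below the
    threshold, also compute and store the secondary min-hash from the
    same block.
    """
    sig = []
    for primary, secondary in blocks:
        p = min_hash(bits, primary)
        if p is None:  # all-zero input: nothing to hash
            continue
        sig.append(p)
        if p < threshold:
            sig.append(min_hash(bits, secondary))
    return sig
```

With `blocks = make_blocks(64, 8)`, calling `signature(bits, blocks, threshold=4)` on a 64-bit input yields between 8 and 16 values, depending on how often the primary value falls below the threshold.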
    • 10. Granted invention patent
    • Hashing techniques for data set similarity determination
    • US09311403B1
    • 2016-04-12
    • US13162061
    • 2011-06-16
    • Sergey Ioffe
    • Sergey Ioffe
    • G06F17/30
    • G06F17/30864; G06F17/30247; G06F17/30256
    • Methods, systems and computer program product embodiments for hashing techniques for determining similarity between data sets are described herein. A method embodiment includes initializing a random number generator with a weighted min-hash value as a seed, wherein the weighted min-hash value approximates a similarity distance between data sets. A number of bits in the weighted min-hash value is determined by uniformly sampling an integer bit value using the random number generator. A system embodiment includes a repository configured to store a plurality of data sets and a hash generator configured to generate weighted min-hash values from the data sets. The system further includes a similarity determiner configured to determine a similarity between the data sets.
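One reading of the seeding step in the abstract above is that each weighted min-hash value is compacted to a fixed number of bits by seeding a random number generator with it and drawing an integer uniformly. The sketch below illustrates only that step, under assumptions: the abstract does not say how the weighted min-hash values are computed, so they are treated here as opaque samples (hypothetical `(element, level)` tuples), and the names `to_b_bits`, `compact_signature`, and `estimated_similarity` are illustrative.

```python
import random


def to_b_bits(sample, b):
    """Map a weighted min-hash sample to an integer in [0, 2**b).

    The sample seeds a private generator and a b-bit integer is drawn
    uniformly from it, so equal samples always produce equal codes.
    """
    # A string seed is hashed deterministically, independent of PYTHONHASHSEED.
    rng = random.Random(repr(sample))
    return rng.getrandbits(b)


def compact_signature(samples, b):
    """Compact every sample of a signature to b bits."""
    return [to_b_bits(sample, b) for sample in samples]


def estimated_similarity(sig_a, sig_b):
    """Fraction of positions where two equal-length signatures agree."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

Because distinct samples can land on the same b-bit code by chance (with probability about 2^-b per position), the match fraction over compacted codes slightly overstates the match fraction over the original samples; a larger `b` reduces that effect at the cost of storage.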