    • 1. Granted invention patent
    • Measuring confidence of file clustering and clustering based file classification
    • US08214365B1
    • 2012-07-03
    • US13036864
    • 2011-02-28
    • Pratyusa Kumar Manadhata; Sandeep B. Bhatkar; Kent E. Griffin
    • Pratyusa Kumar Manadhata; Sandeep B. Bhatkar; Kent E. Griffin
    • G06F7/00
    • G06F21/566; G06F21/577; G06K9/6272; G06K9/723
    • A uniformity of a cluster of samples is determined, and a corresponding raw confidence value is calculated. A confidence interval weight is calculated using a confidence interval to determine reliability of the uniformity. A trace length weight is calculated, as a function of traces of the samples. An n-gram weight is calculated, as a function of numbers of n-grams generated by the samples. A compactness weight is calculated, as a function of the similarity of the samples. A cluster weight is calculated as a function of the four above-described weights. A cluster confidence measurement is calculated as a function of the cluster weight and the raw confidence value. When a new sample is assigned to the cluster, an assignment confidence measurement is calculated, as a function of the cluster's confidence measurement and the sample's trace length, n-grams and similarity.
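The abstract of US08214365B1 names the inputs to each weight but gives no formulas, so the following Python sketch is only one possible reading. Everything concrete in it is an assumption made for illustration: the `Sample` fields, the Wilson score interval as the confidence interval, the `saturating` helper with its `trace_scale` and `ngram_scale` parameters, and the plain product used to combine the four weights.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Sample:
    label: str         # known classification, e.g. "malicious" or "benign"
    trace_length: int  # length of the sample's trace
    ngram_count: int   # number of n-grams generated by the sample
    similarity: float  # similarity to the rest of the cluster, in [0, 1]


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def saturating(value, scale):
    """Hypothetical weight: grows linearly with value and levels off at 1."""
    return min(1.0, value / scale)


def cluster_confidence(samples, trace_scale=1000, ngram_scale=500):
    if not samples:
        raise ValueError("cluster must contain at least one sample")
    n = len(samples)

    # Uniformity: share of samples carrying the most common label.
    majority = Counter(s.label for s in samples).most_common(1)[0][1]
    raw_confidence = majority / n

    # Assumed weights; the abstract only says what each one depends on.
    low, high = wilson_interval(majority, n)
    ci_weight = 1.0 - (high - low)  # narrow interval means reliable uniformity
    trace_weight = saturating(sum(s.trace_length for s in samples) / n, trace_scale)
    ngram_weight = saturating(sum(s.ngram_count for s in samples) / n, ngram_scale)
    compactness_weight = sum(s.similarity for s in samples) / n

    # Assumed combination: product of the four weights.
    cluster_weight = ci_weight * trace_weight * ngram_weight * compactness_weight
    return cluster_weight * raw_confidence


def assignment_confidence(cluster_conf, sample, trace_scale=1000, ngram_scale=500):
    """Confidence of assigning one new sample to a cluster (assumed product form)."""
    return (cluster_conf
            * saturating(sample.trace_length, trace_scale)
            * saturating(sample.ngram_count, ngram_scale)
            * sample.similarity)
```

Under these assumptions, a larger cluster with the same uniformity scores higher because its confidence interval is narrower, and a new sample with a short trace or low similarity receives a lower assignment confidence than the cluster itself.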
    • 3. Granted invention patent
    • Encoding machine code instructions for static feature based malware clustering
    • US08826439B1
    • 2014-09-02
    • US13014552
    • 2011-01-26
    • Xin Hu; Kent E. Griffin; Sandeep B. Bhatkar
    • Xin Hu; Kent E. Griffin; Sandeep B. Bhatkar
    • G06F11/00; G06F21/56
    • G06F21/56; G06F21/563
    • Machine language instruction sequences of computer files are extracted and encoded into standardized opcode sequences. The standardized opcodes in the sequences are of the same length and do not include operands. A multi-dimension vector is generated as a static feature for each computer file, where each element in the vector corresponds to the number of occurrences of a unique N-gram (i.e., unique sequence of N consecutive standardized opcodes) in the standardized opcode sequence for that computer file. The computer files are clustered into clusters of similarly classified files based on similarities of their static features. An unknown computer file can be classified by first grouping the file into a cluster of files with similar static features (e.g., into the cluster with the shortest average distance), and then determining the classification of that file based on the classifications of other files that belong to the same cluster.
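The pipeline in the abstract of US08826439B1 (strip operands, count opcode N-grams, assign an unknown file to the cluster with the shortest average distance, then classify by the other members) can be sketched in Python as follows. This is a simplified illustration, not the patented encoding: the function names are hypothetical, the input is assumed to be disassembly text rather than raw machine code, the mnemonic stands in for the fixed-length standardized opcode, Euclidean distance is an assumed metric, and a majority vote is an assumed way of using the classifications of the other files.

```python
import math
from collections import Counter


def standardize(instructions):
    """Keep only the mnemonic of each instruction, dropping all operands."""
    return [line.split()[0].lower() for line in instructions if line.strip()]


def ngram_features(opcodes, n=2):
    """Count each run of n consecutive standardized opcodes."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


def distance(a, b):
    """Euclidean distance between two sparse N-gram count vectors."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in keys))


def classify(unknown, clusters):
    """Classify a feature vector from its nearest cluster.

    clusters is a list of clusters, each a non-empty list of
    (features, label) pairs for files whose classification is known.
    """
    def average_distance(cluster):
        return sum(distance(unknown, f) for f, _ in cluster) / len(cluster)

    nearest = min(clusters, key=average_distance)
    # Assumed rule: take the most common label among the cluster's members.
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Usage with made-up instruction listings.
packed = ngram_features(standardize(["push ebp", "mov ebp, esp"] * 2))
plain = ngram_features(standardize(["xor eax, eax", "add eax, 1"] * 2))
clusters = [[(packed, "malicious")], [(plain, "benign")]]
unknown = ngram_features(standardize(["push ebp", "mov ebp, esp"] * 2 + ["ret"]))
print(classify(unknown, clusters))
```

Because operands are discarded, two files that differ only in registers or addresses produce the same N-gram counts, which is what lets the static features group variants of the same code.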