Basic information:
- Patent title: METHOD AND APPARATUS FOR SELECTING CLUSTERINGS TO CLASSIFY A DATA SET
- Patent title (Chinese): 用于选择聚类来分类数据集的方法和设备
- Application number: PCT/US2012022178  Filing date: 2012-01-23
- Publication number: WO2012102990A3  Publication date: 2012-10-04
- Inventors: KING GARY, GRIMMER JUSTIN
- Applicants: HARVARD COLLEGE, KING GARY, GRIMMER JUSTIN
- Patentees: HARVARD COLLEGE, KING GARY, GRIMMER JUSTIN
- Current patentees: HARVARD COLLEGE, KING GARY, GRIMMER JUSTIN
- Priority: US201161436037 2011-01-25
- Main classification: G06F17/00
- IPC classification: G06F17/00; G06F3/14; G06F17/40
Abstract:
In a computer assisted clustering method, a clustering space is generated from fixed basis partitions that embed the entire space of all possible clusterings. A lower dimensional clustering space is first created from the space of all possible clusterings by isometrically embedding the space of all possible clusterings in a lower dimensional Euclidean space. This lower dimensional space is then sampled based on the number of documents in the corpus. Partitions are then developed based on the samples that tessellate the space. Finally, using clusterings representative of these tessellations, a two-dimensional representation for users to explore is created.
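The pipeline in the abstract (embed clusterings in a low-dimensional Euclidean space, sample that space, tessellate it, and present a two-dimensional view) can be illustrated with a minimal sketch. This is not the claimed method: it assumes variation of information as the distance between clusterings, swaps in classical multidimensional scaling (which is only approximately distance-preserving) for the isometric embedding, and uses a fixed grid rather than a sample size tied to the number of documents. The function names `variation_of_information`, `classical_mds`, and `nearest_clustering_grid` are hypothetical.

```python
import numpy as np


def variation_of_information(a, b):
    """Variation of information between two label vectors over the same items."""
    a = np.asarray(a).ravel()
    b = np.asarray(b).ravel()
    _, ai = np.unique(a, return_inverse=True)
    _, bi = np.unique(b, return_inverse=True)
    # Joint distribution of cluster memberships (contingency table / n).
    table = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(table, (ai, bi), 1.0)
    p = table / a.size
    pa = np.broadcast_to(p.sum(axis=1)[:, None], p.shape)
    pb = np.broadcast_to(p.sum(axis=0)[None, :], p.shape)
    nz = p > 0
    # VI = H(A|B) + H(B|A), summed over non-empty cells only.
    vi = -np.sum(p[nz] * (np.log(p[nz] / pa[nz]) + np.log(p[nz] / pb[nz])))
    return float(max(vi, 0.0))


def classical_mds(dist, dims=2):
    """Embed points in `dims` dimensions from a pairwise distance matrix."""
    d2 = np.asarray(dist, dtype=float) ** 2
    m = d2.shape[0]
    centering = np.eye(m) - np.ones((m, m)) / m
    gram = -0.5 * centering @ d2 @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def nearest_clustering_grid(coords, steps=5):
    """Sample a grid over the 2-D space and assign each sample to the
    nearest embedded clustering (a discrete Voronoi tessellation)."""
    lo, hi = coords.min(axis=0), coords.max(axis=0)
    xs = np.linspace(lo[0], hi[0], steps)
    ys = np.linspace(lo[1], hi[1], steps)
    grid = np.array([(x, y) for x in xs for y in ys])
    d = np.linalg.norm(grid[:, None, :] - coords[None, :, :], axis=2)
    return grid, d.argmin(axis=1)


# Four toy clusterings of the same six documents (illustrative data only).
clusterings = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 0, 1, 1, 1],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 1],
]
dist = np.array(
    [[variation_of_information(a, b) for b in clusterings] for a in clusterings]
)
coords = classical_mds(dist, dims=2)          # one 2-D point per clustering
grid, owner = nearest_clustering_grid(coords)  # tessellation cell per sample
```

Each row of `coords` places one clustering in the plane, and `owner` records which clustering each grid sample falls closest to, so a user interface could show the representative clustering for whatever region the user points at.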