    • 2. Granted invention patent
    • Systems and methods for using anchor text as parallel corpora for cross-language information retrieval
    • US07146358B1
    • 2006-12-05
    • US09939661
    • 2001-08-28
    • Luis Gravano; Monika H. Henzinger
    • Luis Gravano; Monika H. Henzinger
    • G06F17/30; G06F7/00
    • G06F17/30864; Y10S707/99934; Y10S707/99935
    • A system performs cross-language query translations. The system receives a search query that includes terms in a first language and determines possible translations of the terms of the search query into a second language. The system also locates documents for use as parallel corpora to aid in the translation by: (1) locating documents in the first language that contain references that match the terms of the search query and identify documents in the second language; (2) locating documents in the first language that contain references that match the terms of the query and refer to other documents in the first language and identify documents in the second language that contain references to the other documents; or (3) locating documents in the first language that match the terms of the query and identify documents in the second language that contain references to the documents in the first language. The system may use the second language documents as parallel corpora to disambiguate among the possible translations of the terms of the search query and identify one of the possible translations as a likely translation of the search query into the second language.
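The disambiguation step described in the abstract above can be illustrated with a minimal sketch. It assumes the candidate translations and the second-language anchor texts have already been collected; the function name `pick_translation`, the whitespace tokenization, and the scoring rule (count the anchors that contain every chosen word) are hypothetical choices for illustration, not the method claimed in the patent.

```python
from itertools import product


def pick_translation(candidates, anchor_texts):
    """Pick the combination of candidate translations best supported by anchors.

    candidates   -- one list of possible second-language translations per query term
    anchor_texts -- second-language anchor strings used as the parallel corpus
    Both inputs and the scoring rule are assumptions made for this sketch.
    """
    # Tokenize each anchor once; a simple lowercase whitespace split is assumed.
    tokenized = [set(text.lower().split()) for text in anchor_texts]

    best, best_score = None, -1
    for combo in product(*candidates):
        words = {w.lower() for w in combo}
        # Score = number of anchors in which all chosen translations co-occur.
        score = sum(1 for tokens in tokenized if words <= tokens)
        if score > best_score:
            best, best_score = combo, score
    return best, best_score


# Hypothetical example: the English query "bank interest" translated into Spanish.
candidates = [["banco", "orilla"], ["interés", "afición"]]
anchors = [
    "tipos de interés del banco",
    "banco central interés",
    "orilla del río",
]
print(pick_translation(candidates, anchors))  # (('banco', 'interés'), 2)
```

Enumerating every combination is exponential in the number of query terms, which is acceptable for short search queries but would need pruning for longer inputs.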
    • 3. Granted invention patent
    • Systems and methods for using anchor text as parallel corpora for cross-language information retrieval
    • US08190608B1
    • 2012-05-29
    • US13174209
    • 2011-06-30
    • Luis Gravano; Monika H. Henzinger
    • Luis Gravano; Monika H. Henzinger
    • G06F17/30
    • G06F17/30864; Y10S707/99934; Y10S707/99935
    • A system performs cross-language query translations. The system receives a search query that includes terms in a first language and determines possible translations of the terms of the search query into a second language. The system also locates documents for use as parallel corpora to aid in the translation by: (1) locating documents in the first language that contain references that match the terms of the search query and identify documents in the second language; (2) locating documents in the first language that contain references that match the terms of the query and refer to other documents in the first language and identify documents in the second language that contain references to the other documents; or (3) locating documents in the first language that match the terms of the query and identify documents in the second language that contain references to the documents in the first language. The system may use the second language documents as parallel corpora to disambiguate among the possible translations of the terms of the search query and identify one of the possible translations as a likely translation of the search query into the second language.
    • 4. Granted invention patent
    • Systems and methods for using anchor text as parallel corpora for cross-language information retrieval
    • US07996402B1
    • 2011-08-09
    • US12872755
    • 2010-08-31
    • Luis Gravano; Monika H. Henzinger
    • Luis Gravano; Monika H. Henzinger
    • G06F17/30
    • G06F17/30864; Y10S707/99934; Y10S707/99935
    • A system performs cross-language query translations. The system receives a search query that includes terms in a first language and determines possible translations of the terms of the search query into a second language. The system also locates documents for use as parallel corpora to aid in the translation by: (1) locating documents in the first language that contain references that match the terms of the search query and identify documents in the second language; (2) locating documents in the first language that contain references that match the terms of the query and refer to other documents in the first language and identify documents in the second language that contain references to the other documents; or (3) locating documents in the first language that match the terms of the query and identify documents in the second language that contain references to the documents in the first language. The system may use the second language documents as parallel corpora to disambiguate among the possible translations of the terms of the search query and identify one of the possible translations as a likely translation of the search query into the second language.
    • 5. Granted invention patent
    • Systems and methods for using anchor text as parallel corpora for cross-language information retrieval
    • US07814103B1
    • 2010-10-12
    • US11468674
    • 2006-08-30
    • Luis Gravano; Monika H. Henzinger
    • Luis Gravano; Monika H. Henzinger
    • G06F17/30
    • G06F17/30864; Y10S707/99934; Y10S707/99935
    • A system performs cross-language query translations. The system receives a search query that includes terms in a first language and determines possible translations of the terms of the search query into a second language. The system also locates documents for use as parallel corpora to aid in the translation by: (1) locating documents in the first language that contain references that match the terms of the search query and identify documents in the second language; (2) locating documents in the first language that contain references that match the terms of the query and refer to other documents in the first language and identify documents in the second language that contain references to the other documents; or (3) locating documents in the first language that match the terms of the query and identify documents in the second language that contain references to the documents in the first language. The system may use the second language documents as parallel corpora to disambiguate among the possible translations of the terms of the search query and identify one of the possible translations as a likely translation of the search query into the second language.
    • 7. Invention patent application
    • Detecting Duplicate and Near-Duplicate Files
    • US20120290597A1
    • 2012-11-15
    • US13225342
    • 2011-09-02
    • Monika H. Henzinger
    • Monika H. Henzinger
    • G06F17/30
    • G06F17/2211; G06F16/958
    • Near-duplicate documents may be identified by (a) accepting a set of documents, (b) processing the set of documents to determine a first set of near-duplicate documents using a first document similarity technique, and (c) processing the first set of near duplicate documents to determine a second set of near-duplicate documents using a second document similarity technique. The first document similarity technique might be token order dependent, and the second document similarity technique might be order independent. The first document similarity technique might be token frequency independent, and the second document similarity technique might be frequency dependent. The first document similarity technique might determine whether two documents are near-duplicates using representations based on a subset of the words or tokens of the documents, and the second document similarity technique might determine whether two documents are near-duplicates using representations based on all of the words or tokens of the documents. The first document similarity technique might use set intersection to determine whether or not documents are near-duplicates, and the second document similarity technique might use random projections to determine whether or not documents are near-duplicates.
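The two-stage filtering described in the abstract above can be sketched as follows. This is an illustrative reading, not the implementation in the application: stage one compares sets of word shingles by set intersection (order dependent, frequency independent), and stage two compares sign bits of pseudo-random projections of token counts (order independent, frequency dependent). All function names, the shingle size `k`, the thresholds `t1` and `t2`, and the use of MD5 to derive projection signs are assumptions made for this sketch.

```python
import hashlib
from collections import Counter
from itertools import combinations


def shingles(tokens, k=3):
    """Set of k-token windows; depends on token order, ignores frequency."""
    if len(tokens) < k:
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Set-intersection similarity |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def projection_bits(tokens, n_bits=64):
    """Sign bits of n_bits pseudo-random projections of the token-count vector."""
    counts = Counter(tokens)
    bits = []
    for i in range(n_bits):
        total = 0
        for tok, freq in counts.items():
            # Deterministic +1/-1 weight per (dimension, token), derived from a hash.
            digest = hashlib.md5(f"{i}:{tok}".encode("utf-8")).digest()
            total += freq if digest[0] & 1 else -freq
        bits.append(total >= 0)
    return bits


def bit_agreement(a, b):
    """Fraction of projection bits on which two documents agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def near_duplicates(docs, k=3, t1=0.5, t2=0.9):
    """Return (stage-one pairs, stage-two pairs); stage two only refines stage one."""
    tokens = {name: text.lower().split() for name, text in docs.items()}
    stage1 = [
        (a, b) for a, b in combinations(sorted(docs), 2)
        if jaccard(shingles(tokens[a], k), shingles(tokens[b], k)) >= t1
    ]
    bits = {name: projection_bits(toks) for name, toks in tokens.items()}
    stage2 = [(a, b) for a, b in stage1 if bit_agreement(bits[a], bits[b]) >= t2]
    return stage1, stage2
```

Calling `near_duplicates` on a dictionary that maps document names to text returns the candidate pairs that pass the first technique and the subset that also passes the second. The pairwise comparison in stage one is quadratic in the number of documents, so a larger collection would need an index or sampled fingerprints instead.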