Quick Patent Search - fast search of global patents, free patent database for commercial use - IPRDB

1. Granted invention patent

US07275029B1 System and method for joint optimization of language model performance and size
Legal status: In force
Publication No.: US07275029B1
Publication date: 2007-09-25
Application No.: US09607786
Filing date: 2000-06-30
Applicants: Jianfeng Gao, Kai-Fu Lee, Mingjing Li, Hai-Feng Wang, Dong-Feng Cai, Lee-Feng Chien
Inventors: Jianfeng Gao, Kai-Fu Lee, Mingjing Li, Hai-Feng Wang, Dong-Feng Cai, Lee-Feng Chien
IPC classification: G06F17/27
CPC classification: G06F17/2735, G06F17/274, G06F17/2818
Abstract: A method for the joint optimization of language model performance and size is presented, comprising developing a language model from a tuning set of information, segmenting at least a subset of a received textual corpus and calculating a perplexity value for each segment, and refining the language model with one or more segments of the received corpus based, at least in part, on the calculated perplexity value for the one or more segments.
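
As a rough illustration of the perplexity-based selection this abstract describes, the sketch below builds a small language model from a tuning set, scores each corpus segment by perplexity, and keeps the segments that fall under a threshold. The unigram model, the add-one smoothing, the threshold value, the direction of the comparison, and every function name are assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter

def train_unigram(tokens):
    """Return an add-one-smoothed unigram probability function (assumed model)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserves mass for unseen words
    def prob(word):
        return (counts[word] + 1) / (total + vocab)
    return prob

def perplexity(prob, segment):
    """Perplexity of a token list under the model: exp(-mean log-probability)."""
    log_sum = sum(math.log(prob(w)) for w in segment)
    return math.exp(-log_sum / len(segment))

def select_segments(prob, segments, max_perplexity):
    """Keep segments whose perplexity does not exceed the (hypothetical) threshold."""
    return [s for s in segments if perplexity(prob, s) <= max_perplexity]

tuning = "the cat sat on the mat".split()
model = train_unigram(tuning)
segments = [["the", "cat"], ["quantum", "flux"]]
kept = select_segments(model, segments, max_perplexity=10.0)
print(kept)
```

In a fuller pipeline the kept segments would be used to retrain or extend the model; that refinement step is left out here.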

2. Granted invention patent

US06904402B1 System and iterative method for lexicon, segmentation and language model joint optimization
Legal status: In force
Publication No.: US06904402B1
Publication date: 2005-06-07
Application No.: US09609202
Filing date: 2000-06-30
Applicants: Hai-Feng Wang, Chang-Ning Huang, Kai-Fu Lee, Shuo Di, Jianfeng Gao, Dong-Feng Cai, Lee-Feng Chien
Inventors: Hai-Feng Wang, Chang-Ning Huang, Kai-Fu Lee, Shuo Di, Jianfeng Gao, Dong-Feng Cai, Lee-Feng Chien
IPC classification: G06F17/28, G06F17/27, G10L15/06, G10L15/18, G06F17/21, G06F17/20, G10L15/00
CPC classification: G06F17/274, G10L15/197
Abstract: A method for optimizing a language model is presented, comprising developing an initial language model from a lexicon and segmentation derived from a received corpus using a maximum match technique, and iteratively refining the initial language model by dynamically updating the lexicon and re-segmenting the corpus according to statistical principles until a threshold of predictive capability is achieved.
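
The "maximum match technique" named in this abstract is commonly implemented as greedy forward matching against a lexicon. The sketch below shows only that initial segmentation step, assuming a forward (left-to-right) variant, a hypothetical `max_len` limit, and a toy lexicon; the iterative lexicon update and re-segmentation are not modeled.

```python
def max_match(text, lexicon, max_len=4):
    """Greedy forward maximum-match segmentation (assumed variant)."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words

# Toy lexicon, invented for illustration.
lexicon = {"语言", "模型", "语言模型", "优化"}
print(max_match("语言模型优化", lexicon))
```

Because the longest entry wins, "语言模型" is emitted as one word rather than as "语言" followed by "模型".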

3. Granted invention patent

US07216066B2 Method and apparatus for generating and managing a language model data structure
Legal status: Lapsed
Publication No.: US07216066B2
Publication date: 2007-05-08
Application No.: US11276292
Filing date: 2006-02-22
Applicants: Shuo Di, Kai-Fu Lee, Lee-Feng Chien, Zheng Chen, Jianfeng Gao
Inventors: Shuo Di, Kai-Fu Lee, Lee-Feng Chien, Zheng Chen, Jianfeng Gao
IPC classification: G06F7/60, G06F17/27, G10L15/00, G10L19/14
CPC classification: G06F17/27, G10L15/285
Abstract: A method is presented comprising assigning each of a plurality of segments comprising a received corpus to a node in a data structure denoting dependencies between nodes, and calculating a transitional probability between each of the nodes in the data structure.
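
One way to picture a node structure with transitional probabilities is a prefix tree of n-grams, where each node stores a count and the probability of moving from a parent to a child is the ratio of their counts. The sketch below assumes exactly that; the `Node` class, the `order` parameter, and the relative-frequency estimate are illustrative choices, not details taken from the abstract.

```python
class Node:
    """Tree node holding an occurrence count and child nodes keyed by item."""
    def __init__(self):
        self.count = 0
        self.children = {}

def build_tree(segments, order=2):
    """Insert every n-gram (up to `order` items) of each segment into a prefix tree."""
    root = Node()
    for seg in segments:
        for start in range(len(seg)):
            root.count += 1
            node = root
            for item in seg[start:start + order]:
                node = node.children.setdefault(item, Node())
                node.count += 1
    return root

def transition_prob(root, history, item):
    """Estimate P(item | history) as count(history + item) / count(history)."""
    node = root
    for h in history:
        node = node.children.get(h)
        if node is None:
            return 0.0
    child = node.children.get(item)
    return child.count / node.count if child is not None else 0.0

root = build_tree([["a", "b"], ["a", "c"], ["a", "b"]])
print(transition_prob(root, ["a"], "b"))
```

Items at the end of a segment have no successor, so the probabilities out of a node can sum to less than one in this simplified estimate.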

4. Granted invention patent

US07020587B1 Method and apparatus for generating and managing a language model data structure
Legal status: Lapsed
Publication No.: US07020587B1
Publication date: 2006-03-28
Application No.: US09608526
Filing date: 2000-06-30
Applicants: Shuo Di, Kai-Fu Lee, Lee-Feng Chien, Zheng Chen, Jianfeng Gao
Inventors: Shuo Di, Kai-Fu Lee, Lee-Feng Chien, Zheng Chen, Jianfeng Gao
IPC classification: G06F7/60
CPC classification: G06F17/27, G10L15/285
Abstract: The generation and management of a language model data structure include assigning each segment of a received corpus to a node in a data structure that denotes dependencies between the respective nodes. A transitional probability between each of the nodes in the data structure is calculated. A frequency of occurrence is calculated for each item of the respective segments, and those nodes of the data structure associated with items that do not meet a minimum frequency of occurrence threshold are removed. The data structure may be managed across a system memory of a computer system and an extended memory of the computer system.
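
The frequency-threshold pruning mentioned in this abstract can be sketched on a small count tree. The nested-dict layout, the `prune` function, and the use of each node's own count as its frequency of occurrence are assumptions for illustration; the split between system memory and extended memory is not modeled.

```python
def prune(node, min_count):
    """Recursively drop child nodes whose count is below min_count."""
    kept = {}
    for item, child in node["children"].items():
        if child["count"] >= min_count:
            prune(child, min_count)
            kept[item] = child
    node["children"] = kept
    return node

# Hypothetical tree: each node records how often its item sequence occurred.
tree = {"count": 6, "children": {
    "a": {"count": 3, "children": {
        "b": {"count": 2, "children": {}},
        "c": {"count": 1, "children": {}},
    }},
    "d": {"count": 1, "children": {}},
}}
prune(tree, min_count=2)
print(sorted(tree["children"]))
```

Removing a node also removes its whole subtree, which is what keeps the structure small.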

5. Invention patent application

US20060184341A1 Method and Apparatus for Generating and Managing a Language Model Data Structure
Legal status: Lapsed
Publication No.: US20060184341A1
Publication date: 2006-08-17
Application No.: US11276292
Filing date: 2006-02-22
Applicants: Shuo Di, Kai-Fu Lee, Lee-Feng Chien, Zheng Chen, Jianfeng Gao
Inventors: Shuo Di, Kai-Fu Lee, Lee-Feng Chien, Zheng Chen, Jianfeng Gao
IPC classification: G06F17/10
CPC classification: G06F17/27, G10L15/285
Abstract: A method is presented comprising assigning each of a plurality of segments comprising a received corpus to a node in a data structure denoting dependencies between nodes, and calculating a transitional probability between each of the nodes in the data structure.

6. Granted invention patent

US06766320B1 Search engine with natural language-based robust parsing for user query and relevance feedback learning
Legal status: In force
Publication No.: US06766320B1
Publication date: 2004-07-20
Application No.: US09645806
Filing date: 2000-08-24
Applicants: Hai-Feng Wang, Kai-Fu Lee, Qiang Yang
Inventors: Hai-Feng Wang, Kai-Fu Lee, Qiang Yang
IPC classification: G06F17/30
CPC classification: G06F17/30616, G06F17/3043, G06F2216/03, Y10S707/99933, Y10S707/99935
Abstract: A search engine architecture is designed to handle a full range of user queries, from complex sentence-based queries to simple keyword searches. The search engine architecture includes a natural language parser that parses a user query and extracts syntactic and semantic information. The parser is robust in the sense that it not only returns fully-parsed results (e.g., a parse tree), but is also capable of returning partially-parsed fragments in those cases where more accurate or descriptive information in the user query is unavailable. A question matcher is employed to match the fully-parsed output and the partially-parsed fragments to a set of frequently asked questions (FAQs) stored in a database. The question matcher then correlates the questions with a group of possible answers arranged in standard templates that represent possible solutions to the user query. The search engine architecture also has a keyword searcher to locate other possible answers by searching on any keywords returned from the parser. The answers returned from the question matcher and the keyword searcher are presented to the user for confirmation as to which answer best represents the user's intentions when entering the initial search query. The search engine architecture logs the queries, the answers returned to the user, and the user's confirmation feedback in a log database. The search engine has a log analyzer to evaluate the log database to glean information that improves performance of the search engine over time by training the parser and the question matcher.

7. Granted invention patent

US07165019B1 Language input architecture for converting one text form to another text form with modeless entry
Legal status: In force
Publication No.: US07165019B1
Publication date: 2007-01-16
Application No.: US09606807
Filing date: 2000-06-28
Applicants: Kai-Fu Lee, Zheng Chen, Jian Han
Inventors: Kai-Fu Lee, Zheng Chen, Jian Han
IPC classification: G06F17/28
CPC classification: G06F17/273, G06F17/2715, G06F17/2775, G06F17/2863
Abstract: A language input architecture converts input strings of phonetic text (e.g., Chinese Pinyin) to an output string of language text (e.g., Chinese Hanzi) in a manner that minimizes typographical errors and conversion errors that occur during conversion from the phonetic text to the language text. The language input architecture has a search engine, one or more typing models, a language model, and one or more lexicons for different languages. Each typing model is trained on real data, and learns probabilities of typing errors. The typing model is configured to generate a list of probable typing candidates that may be substituted for the input string based on probabilities of how likely each of the candidate strings was incorrectly entered as the input string.
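
To make the idea of a typing model "trained on real data" concrete, the sketch below estimates P(typed | intended) by relative frequency from observed pairs and then ranks candidate intended strings for a given input. Treating whole strings as the unit, the training pairs, and both function names are simplifying assumptions; the abstract does not say how the model is parameterized.

```python
from collections import Counter, defaultdict

def train_typing_model(pairs):
    """Estimate P(typed | intended) by relative frequency from (intended, typed) pairs."""
    counts = defaultdict(Counter)
    for intended, typed in pairs:
        counts[intended][typed] += 1
    return {
        intended: {typed: n / sum(c.values()) for typed, n in c.items()}
        for intended, c in counts.items()
    }

def typing_candidates(model, typed):
    """Rank intended strings by P(typed | intended), highest first."""
    scored = [(probs[typed], intended)
              for intended, probs in model.items() if typed in probs]
    return [(intended, p) for p, intended in sorted(scored, reverse=True)]

# Invented training data: what the user meant versus what was actually typed.
pairs = [
    ("zhong", "zhong"), ("zhong", "zhogn"), ("zhong", "zhong"),
    ("zhang", "zhang"), ("zhang", "zhogn"), ("zhang", "zhang"), ("zhang", "zhang"),
]
typing_model = train_typing_model(pairs)
print(typing_candidates(typing_model, "zhogn"))
```

Here the mistyped input "zhogn" yields "zhong" ahead of "zhang", because the toy data makes that error more likely when "zhong" was intended.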

8. Granted invention patent

US06848080B1 Language input architecture for converting one text form to another text form with tolerance to spelling, typographical, and conversion errors
Legal status: Lapsed
Publication No.: US06848080B1
Publication date: 2005-01-25
Application No.: US09606660
Filing date: 2000-06-28
Applicants: Kai-Fu Lee, Zheng Chen, Jian Han
Inventors: Kai-Fu Lee, Zheng Chen, Jian Han
IPC classification: G06F17/21, G06F17/22, G06F17/27, G06F17/28, G06F17/24
CPC classification: G06F17/273, G06F17/2223, G06F17/2715, G06F17/2818, G06F17/2863
Abstract: A language input architecture converts input strings of phonetic text to an output string of language text. The language input architecture has a search engine, one or more typing models, a language model, and one or more lexicons for different languages. The typing model is configured to generate a list of probable typing candidates that may be substituted for the input string based on probabilities of how likely each of the candidate strings was incorrectly entered as the input string. The language model provides probable conversion strings for each of the typing candidates based on probabilities of how likely a probable conversion output string represents the candidate string. The search engine combines the probabilities of the typing and language models to find the most probable conversion string that represents a converted form of the input string.
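
The combination step described in this abstract resembles a noisy-channel search: score each (typing candidate, conversion string) pair by the product of the two model probabilities and keep the best. The sketch below assumes both models are given as lookup tables with invented probabilities; `best_conversion` and the table layout are hypothetical and stand in for whatever search the patent actually uses.

```python
import math

def best_conversion(typed, typing_probs, conversion_probs):
    """Return (output, candidate, log_score) with the highest combined score, or None.

    typing_probs[typed][candidate]      ~ typing-model probability (assumed given)
    conversion_probs[candidate][output] ~ language-model probability (assumed given)
    """
    best = None
    for candidate, p_typing in typing_probs.get(typed, {}).items():
        for output, p_lm in conversion_probs.get(candidate, {}).items():
            # Sum of logs is the log of the product of the two probabilities.
            score = math.log(p_typing) + math.log(p_lm)
            if best is None or score > best[2]:
                best = (output, candidate, score)
    return best

# Invented probabilities, for illustration only.
typing_probs = {"nihap": {"nihao": 0.8, "nihap": 0.2}}
conversion_probs = {"nihao": {"你好": 0.9, "拟好": 0.1}}
result = best_conversion("nihap", typing_probs, conversion_probs)
print(result[0], result[1])
```

Working in log space avoids underflow when the same scoring is applied to long strings with many small probabilities.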

9. Granted invention patent

US5577135A Handwriting signal processing front-end for handwriting recognizers
Legal status: Lapsed
Publication No.: US5577135A
Publication date: 1996-11-19
Application No.: US204031
Filing date: 1994-03-01
Applicants: Kamil A. Grajski, Yen-Lu Chow, Kai-Fu Lee
Inventors: Kamil A. Grajski, Yen-Lu Chow, Kai-Fu Lee
IPC classification: G06K9/22, G06K9/62, G06K9/36, G06K9/00
CPC classification: G06K9/00422, G06K9/6218
Abstract: A handwriting signal processing front-end method and apparatus for a handwriting training and recognition system which includes non-uniform segmentation and feature extraction in combination with multiple vector quantization. In a training phase, digitized handwriting samples are partitioned into segments of unequal length. Features are extracted from the segments and are grouped to form feature vectors for each segment. Groups of adjacent feature vectors are then combined to form input frames. Feature-specific vectors are formed by grouping features of the same type from each of the feature vectors within a frame. Multiple vector quantization is then performed on each feature-specific vector to statistically model the distributions of the vectors for each feature by identifying clusters of the vectors and determining the mean locations of the vectors in the clusters. Each mean location is represented by a codebook symbol and this information is stored in a codebook for each feature. These codebooks are then used to train a recognition system. In the testing phase, where the recognition system is to identify handwriting, digitized test handwriting is first processed as in the training phase to generate feature-specific vectors from input frames. Multiple vector quantization is then performed on each feature-specific vector to represent the feature-specific vector using the codebook symbols that were generated for that feature during training. The resulting series of codebook symbols effects a reduced representation of the sampled handwriting data and is used for subsequent handwriting recognition.
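
The testing-phase quantization step in this abstract, where each feature-specific vector is replaced by a symbol from that feature's codebook, can be sketched as a nearest-centroid lookup. The feature names, the codebook contents, and the squared Euclidean distance are assumptions for illustration; codebook training (the clustering step) is taken as already done.

```python
def nearest_symbol(vector, codebook):
    """Index of the codebook centroid closest to vector (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: dist2(vector, codebook[k]))

def quantize_frame(frame, codebooks):
    """Map each feature-specific vector to a symbol from that feature's own codebook."""
    return {name: nearest_symbol(vec, codebooks[name]) for name, vec in frame.items()}

# Hypothetical per-feature codebooks: one list of mean locations per feature type.
codebooks = {
    "direction": [(0.0, 1.0), (1.0, 0.0)],
    "curvature": [(0.0,), (0.5,), (1.0,)],
}
frame = {"direction": (0.9, 0.2), "curvature": (0.45,)}
symbols = quantize_frame(frame, codebooks)
print(symbols)
```

Keeping a separate codebook per feature type is what makes the quantization "multiple": each frame becomes a small tuple of symbols rather than a single one.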

10. Granted invention patent

US07302640B2 Language input architecture for converting one text form to another text form with tolerance to spelling, typographical, and conversion errors
Legal status: Lapsed
Publication No.: US07302640B2
Publication date: 2007-11-27
Application No.: US10970438
Filing date: 2004-10-21
Applicants: Kai-Fu Lee, Zheng Chen, Jian Han
Inventors: Kai-Fu Lee, Zheng Chen, Jian Han
IPC classification: G06F15/00
CPC classification: G06F17/273, G06F17/2223, G06F17/2715, G06F17/2818, G06F17/2863
Abstract: A language input architecture converts input strings of phonetic text to an output string of language text. The language input architecture has a search engine, typing models, a language model, and one or more lexicons for different languages. Each typing model is trained on real data, and learns probabilities of typing errors. The typing model is configured to generate a list of probable typing candidates that may be substituted for the input string based on probabilities of how likely each of the candidate strings was incorrectly entered as the input string. The language model provides probable conversion strings for each of the typing candidates based on probabilities of how likely a probable conversion output string represents the candidate string. The search engine combines the probabilities of the typing and language models to find the most probable conversion string that represents a converted form of the input string.
