Quick Patent Search - search global patents quickly, free patent database for commercial use - IPRDB

1. Granted patent

US07467086B2 Methodology for generating enhanced demiphone acoustic models for speech recognition (lapsed)
Publication No.: US07467086B2
Publication date: 2008-12-16
Application No.: US11013888
Filing date: 2004-12-16
Applicant: Xavier Menendez-Pidal, Lex S. Olorenshaw, Gustavo Hernandez Abrego
Inventor: Xavier Menendez-Pidal, Lex S. Olorenshaw, Gustavo Hernandez Abrego
IPC classification: G10L15/28, G10L15/06
CPC classification: G10L15/142, G10L2015/022
Abstract: A system and method for effectively performing speech recognition procedures includes enhanced demiphone acoustic models that a speech recognition engine utilizes to perform the speech recognition procedures. The enhanced demiphone acoustic models each have three states that are collectively arranged to form a preceding demiphone and a succeeding demiphone. An acoustic model generator may utilize a decision tree for analyzing speech context information from a training database. The acoustic model generator then effectively configures each of the enhanced demiphone acoustic models as either a succeeding-dominant enhanced demiphone acoustic model or a preceding-dominant enhanced demiphone acoustic model to accurately model speech characteristics.

2. Granted patent

US07272560B2 Methodology for performing a refinement procedure to implement a speech recognition dictionary (in force)
Publication No.: US07272560B2
Publication date: 2007-09-18
Application No.: US10805840
Filing date: 2004-03-22
Applicant: Gustavo Hernandez Abrego, Xavier Menendez-Pidal, Lex Olorenshaw
Inventor: Gustavo Hernandez Abrego, Xavier Menendez-Pidal, Lex Olorenshaw
IPC classification: G10L15/06
CPC classification: G10L15/187
Abstract: A system and method for performing a refinement procedure to effectively implement a speech recognition dictionary for spontaneous speech recognition may include a problematic word identifier configured to divide vocabulary words from an initial speech recognition dictionary into problematic words and non-problematic words according to pre-defined identification criteria. A candidate generator may analyze the problematic words to produce one or more pronunciation candidates for each of the problematic words. An optimization module may then perform an optimization process for refining one or more pronunciation candidates according to certain optimization criteria to thereby generate optimized problematic pronunciations. A dictionary refinement manager may finally combine the optimized problematic pronunciations with non-problematic pronunciations of the non-problematic words to produce a refined speech recognition dictionary for use by the speech recognition system.
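
The four stages named in this abstract (identify problematic words, generate candidates, optimize, recombine) can be pictured as a small pipeline. The sketch below is only an illustration: the abstract does not state the identification or optimization criteria, so they are passed in as hypothetical callables, and keeping a single best candidate per word is an assumption.

```python
def refine_dictionary(dictionary, is_problematic, generate_candidates, score):
    """Return a refined copy of `dictionary` (word -> list of pronunciations).

    All three callables are hypothetical stand-ins for criteria that the
    abstract leaves unspecified.
    """
    refined = {}
    for word, pronunciations in dictionary.items():
        if not is_problematic(word, pronunciations):
            # Non-problematic words keep their pronunciations unchanged.
            refined[word] = list(pronunciations)
            continue
        candidates = generate_candidates(word, pronunciations)
        # Assumed optimization step: keep only the highest-scoring candidate.
        refined[word] = [max(candidates, key=score)]
    return refined


# Toy criteria, purely illustrative.
toy = {"the": ["dh ah", "dh iy", "dh"], "cat": ["k ae t"]}
result = refine_dictionary(
    toy,
    is_problematic=lambda word, prons: len(prons) > 1,
    generate_candidates=lambda word, prons: prons,
    score=lambda pron: len(pron.split()),
)
```

Passing the criteria as functions keeps the pipeline shape separate from the scoring details, which would have to come from the patent's full description.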

3. Granted patent

US06850886B2 System and method for speech verification using an efficient confidence measure (in force)
Publication No.: US06850886B2
Publication date: 2005-02-01
Application No.: US09872069
Filing date: 2001-05-31
Applicant: Gustavo Hernandez Abrego, Xavier Menendez-Pidal
Inventor: Gustavo Hernandez Abrego, Xavier Menendez-Pidal
IPC classification: G10L15/00, G10L15/10, G10L15/14
CPC classification: G10L15/10, G10L2015/085
Abstract: The present invention comprises a system and method for speech verification using an efficient confidence measure, and includes a speech verifier which compares a confidence measure for a recognized word to a predetermined threshold value in order to determine whether the recognized word is valid, where a recognized word corresponds to a word model that produces a highest recognition score. In accordance with the present invention, the foregoing confidence measure may be calculated using the recognition score for the recognized word and a pseudo filler score that may be based upon selected average recognition scores from an N-best list of recognition candidates.
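
As a rough illustration of the verification step, the sketch below derives a pseudo filler score from an N-best list and compares the resulting confidence with a threshold. The abstract does not give the formula, so two things here are assumptions: the filler score is the plain average of every candidate except the best one, and the confidence is the difference between the best score and that average.

```python
from statistics import mean


def pseudo_filler_score(nbest_scores, skip=1):
    """Average the N-best scores that remain after dropping the top `skip`."""
    ranked = sorted(nbest_scores, reverse=True)
    rest = ranked[skip:]
    if not rest:
        raise ValueError("need at least one candidate besides the best one")
    return mean(rest)


def verify(nbest_scores, threshold):
    """Return (confidence, accepted) for the top-scoring candidate."""
    best = max(nbest_scores)
    # Assumed confidence measure: margin of the best score over the filler.
    confidence = best - pseudo_filler_score(nbest_scores)
    return confidence, confidence >= threshold


# Log-likelihood-style scores; the values are made up for illustration.
clear_case = verify([-10.0, -30.0, -50.0], threshold=15.0)
close_case = verify([-10.0, -11.0, -12.0], threshold=15.0)
```

A large margin over the rest of the list suggests the recognizer had one clearly preferred hypothesis, while a small margin suggests the input may be out of vocabulary or noise.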

4. Granted patent

US06785648B2 System and method for performing speech recognition in cyclostationary noise environments (lapsed)
Publication No.: US06785648B2
Publication date: 2004-08-31
Application No.: US09872196
Filing date: 2001-05-31
Applicant: Xavier Menendez-Pidal, Gustavo Hernandez Abrego
Inventor: Xavier Menendez-Pidal, Gustavo Hernandez Abrego
IPC classification: G10L15/20
CPC classification: G10L15/20
Abstract: A system and method for performing speech recognition in cyclostationary noise environments includes a characterization module that may access original cyclostationary noise from an intended operating environment of a speech recognition device. The characterization module may then convert the original cyclostationary noise into target stationary noise which retains characteristics of the original cyclostationary noise. A conversion module may then generate a modified training database by utilizing the target stationary noise to modify an original training database that was prepared for training a recognizer in the speech recognition device. A training module may then train the recognizer with the modified training database to thereby optimize speech recognition procedures in cyclostationary noise environments.
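
The abstract does not say how the stationary noise is derived or how the training database is modified. One common way to build a noise-matched training set, shown here purely as an assumption, is to add the noise to each clean utterance at a chosen signal-to-noise ratio. The function names and the sample values are hypothetical.

```python
import math


def power(samples):
    """Mean squared amplitude of a sample sequence."""
    return sum(v * v for v in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the result has the requested SNR."""
    # Repeat the noise so that it covers the whole utterance.
    tiled = [noise[i % len(noise)] for i in range(len(speech))]
    noise_power = power(tiled)
    if noise_power == 0:
        raise ValueError("noise must not be silent")
    # SNR(dB) = 10 * log10(speech_power / (gain^2 * noise_power))
    gain = math.sqrt(power(speech) / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, tiled)]


speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, -0.5]
noisy = mix_at_snr(speech, noise, snr_db=20.0)
```

Applying such a function to every utterance in the original database would yield the modified database that the recognizer is then trained on.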

5. Granted patent

US07035789B2 Supervised automatic text generation based on word classes for language modeling (lapsed)
Publication No.: US07035789B2
Publication date: 2006-04-25
Application No.: US09947114
Filing date: 2001-09-04
Applicant: Gustavo Hernandez Abrego, Xavier Menendez-Pidal
Inventor: Gustavo Hernandez Abrego, Xavier Menendez-Pidal
IPC classification: G06F17/27, G06F17/20, G06F17/21
CPC classification: G10L15/197
Abstract: A system and method is provided that randomly generates text with a given structure. The structure is taken from a number of learning examples. The structure of training examples is captured by word classification and the definition of the relationships between word classes in a given language. The text generated with this procedure is intended to replicate the information given by the original learning examples. The resulting text may be used to better model the structure of a language in a stochastic language model.
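
A minimal sketch of class-based random text generation follows. It assumes that the "relationships between word classes" can be approximated by class-to-class transitions observed in the learning examples, and that the word-to-class mapping is supplied by hand; neither detail is stated in the abstract, and all names are hypothetical.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"


def learn(examples, word_class):
    """Collect class members and class-to-class transitions from examples."""
    members = defaultdict(set)
    transitions = defaultdict(list)
    for sentence in examples:
        words = sentence.split()
        classes = [START] + [word_class[w] for w in words] + [END]
        for word in words:
            members[word_class[word]].add(word)
        for a, b in zip(classes, classes[1:]):
            transitions[a].append(b)
    return members, transitions


def generate(members, transitions, rng, max_len=20):
    """Random walk over word classes, emitting a random member of each."""
    out, state = [], START
    while len(out) < max_len:
        state = rng.choice(transitions[state])
        if state == END:
            break
        out.append(rng.choice(sorted(members[state])))
    return " ".join(out)


word_class = {"play": "VERB", "stop": "VERB", "jazz": "NOUN", "music": "NOUN"}
members, transitions = learn(["play jazz", "stop music"], word_class)
sentence = generate(members, transitions, random.Random(0))
```

With these two examples the generator can also produce "play music" and "stop jazz", sentences that follow the learned structure but do not occur in the training data. Text of that kind is what would be fed to a stochastic language model.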

6. Granted patent

US07272562B2 System and method for utilizing speech recognition to efficiently perform data indexing procedures (in force)
Publication No.: US07272562B2
Publication date: 2007-09-18
Application No.: US10812560
Filing date: 2004-03-30
Applicant: Lex Olorenshaw, Gustavo Hernandez Abrego, Eugene Koontz
Inventor: Lex Olorenshaw, Gustavo Hernandez Abrego, Eugene Koontz
IPC classification: G10L11/00
CPC classification: G06F17/30268, G10L15/26
Abstract: A system and method for utilizing speech recognition to efficiently perform data indexing procedures includes an authoring module that coordinates an authoring procedure for creating an index file that has pattern word sets corresponding to data objects stored in a memory of a host electronic device. The pattern word sets are generated with a speech recognition engine that transforms spoken data descriptions into text data descriptions for creating the pattern word sets. The pattern word sets are associated in the index file with data object identifiers that uniquely identify the corresponding data objects. A retrieval module manages a retrieval procedure in which the speech recognition engine converts a spoken data request into a text data request. The retrieval module compares the text data request and the pattern word sets to identify a requested object identifier for locating a requested data object from among the data objects stored in the memory of the host electronic device.
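
The authoring and retrieval procedures can be sketched with an in-memory index. In this sketch the speech recognition engine is left out: plain strings stand in for its text output. Ranking by word overlap is an assumption, since the abstract only says the request and the pattern word sets are compared, and the identifiers are made up.

```python
def add_entry(index, object_id, description):
    """Authoring: store the pattern word set for one data object."""
    index[object_id] = set(description.lower().split())


def retrieve(index, request):
    """Retrieval: return the object id whose word set overlaps most."""
    words = set(request.lower().split())
    best_id, best_overlap = None, 0
    for object_id, pattern in index.items():
        overlap = len(words & pattern)
        if overlap > best_overlap:
            best_id, best_overlap = object_id, overlap
    return best_id


index = {}
add_entry(index, "img_001", "beach sunset hawaii")
add_entry(index, "img_002", "birthday party cake")
match = retrieve(index, "show me the birthday cake")
no_match = retrieve(index, "quarterly report")
```

Returning None when no word matches lets the caller decide whether to ask the user to repeat the request.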

7. Patent application

US20110191107A1 Structure for Grammar and Dictionary Representation in Voice Recognition and Method for Simplifying Link and Node-Generated Grammars (in force)
Publication No.: US20110191107A1
Publication date: 2011-08-04
Application No.: US13031104
Filing date: 2011-02-18
Applicant: Gustavo Hernandez Abrego, Ruxin Chen
Inventor: Gustavo Hernandez Abrego, Ruxin Chen
IPC classification: G10L15/18
CPC classification: G10L15/193, G10L15/285
Abstract: A speech recognition engine is provided with an acoustic model and a layered grammar and dictionary library. The layered grammar and dictionary library includes a language and non-grammar layer that supplies types of rules a grammar definition layer can use and defines non-grammar the speech recognition engine should ignore. The layered grammar and dictionary library also includes a dictionary layer that defines phonetic transcriptions for word groups the speech recognition engine is meant to recognize when voice input is received. The layered grammar and dictionary library further includes a grammar definition layer that applies rules from the language and non-grammar layer to define combinations of word groups the speech recognition system is meant to recognize. Voice input is received at a speech recognition engine and is processed using the acoustic model and the layered grammar and dictionary library.

8. Granted patent

US08190433B2 Structure for grammar and dictionary representation in voice recognition and method for simplifying link and node-generated grammars (in force)
Publication No.: US08190433B2
Publication date: 2012-05-29
Application No.: US13031104
Filing date: 2011-02-18
Applicant: Gustavo Hernandez Abrego, Ruxin Chen
Inventor: Gustavo Hernandez Abrego, Ruxin Chen
IPC classification: G10L15/18
CPC classification: G10L15/193, G10L15/285
Abstract: A speech recognition engine is provided with an acoustic model and a layered grammar and dictionary library. The layered grammar and dictionary library includes a language and non-grammar layer that supplies types of rules a grammar definition layer can use and defines non-grammar the speech recognition engine should ignore. The layered grammar and dictionary library also includes a dictionary layer that defines phonetic transcriptions for word groups the speech recognition engine is meant to recognize when voice input is received. The layered grammar and dictionary library further includes a grammar definition layer that applies rules from the language and non-grammar layer to define combinations of word groups the speech recognition system is meant to recognize. Voice input is received at a speech recognition engine and is processed using the acoustic model and the layered grammar and dictionary library.

9. Granted patent

US07902447B1 Automatic composition of sound sequences using finite state automata (in force)
Publication No.: US07902447B1
Publication date: 2011-03-08
Application No.: US11542699
Filing date: 2006-10-03
Applicant: Gustavo Hernandez Abrego
Inventor: Gustavo Hernandez Abrego
IPC classification: A63H5/00
CPC classification: G10H1/0025
Abstract: In one embodiment, a method for the automatic composition of music is disclosed. The method begins by receiving a plurality of input sound sequences containing sound frequencies with corresponding time duration. The method continues with converting the plurality of input sound sequences to a finite state automaton using a system that allows over-generation, followed by receiving exploration rules that constrain how the finite state automaton is to be traversed. The next step is creating a path marker data structure indexing a plurality of path markers, where each path marker contains a path marker history and a path marker registry. After the path marker data structure is created, the method continues by traversing the finite state automaton with a graph exploration procedure that uses the exploration rules and the plurality of path markers to determine paths across the finite state automaton. During the exploration the path marker history and the path marker registry of particular path markers are updated when traversing the finite state automaton. As the finite state automaton is traversed the method includes storing the paths across the finite state automaton to the path marker data structure to define recorded path markers, wherein the recorded path markers that are not found in the plurality of input sound sequences define new music compositions.
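
The core idea, building one automaton from several input sequences and keeping only the traversals that differ from every input, can be sketched as follows. This is a simplification under stated assumptions: identical notes from different sequences are merged into one node (one possible form of over-generation), a maximum length is the only exploration rule, and the path marker history and registry are replaced by a plain depth-first search.

```python
START, END = "<start>", "<end>"


def build_automaton(sequences):
    """Merge all sequences into one graph: node -> set of successor nodes."""
    edges = {}
    for seq in sequences:
        path = [START, *seq, END]
        for a, b in zip(path, path[1:]):
            edges.setdefault(a, set()).add(b)
    return edges


def new_sequences(sequences, max_len):
    """Enumerate start-to-end paths that are not among the inputs."""
    edges = build_automaton(sequences)
    known = {tuple(seq) for seq in sequences}
    found = set()

    def walk(node, path):
        for nxt in edges.get(node, ()):
            if nxt == END:
                if path and tuple(path) not in known:
                    found.add(tuple(path))
            elif len(path) < max_len:  # simple stand-in exploration rule
                walk(nxt, path + [nxt])

    walk(START, [])
    return found


# Notes are (pitch, duration) pairs; the values are illustrative.
melody_a = [("C", 1), ("D", 1), ("E", 1)]
melody_b = [("G", 1), ("D", 1), ("A", 1)]
compositions = new_sequences([melody_a, melody_b], max_len=3)
```

Because both melodies pass through the same ("D", 1) node, the search can cross from one melody to the other at that point, which is what produces sequences absent from the input.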

10. Granted patent

US08450591B2 Methods for generating new output sounds from input sounds (in force)
Publication No.: US08450591B2
Publication date: 2013-05-28
Application No.: US13020776
Filing date: 2011-02-03
Applicant: Gustavo Hernandez Abrego
Inventor: Gustavo Hernandez Abrego
IPC classification: A63H5/00, G04B13/00
CPC classification: G10H1/0025
Abstract: Methods for dynamically analyzing input sounds and processing the input sounds to define a new set of output sounds are provided. One method includes receiving a first set of input sounds and a second set of input sounds, where each of the first and second sets of input sounds are processed to identify one of a tone, intensity, or frequency, and a duration. The method defines a node for each identified input sound and a link between the input sounds of the first and second sets of input sounds. The nodes and links from the first and second sets of input sounds create a respective first and second finite state automata. A history value is defined for processing the nodes of the first and second sets of input sounds, and the history value defines a number of previous nodes that will be identical in each of the first and second sets of input sounds before a particular node is shared between the first and second sets of input sounds. Then, the method forms the new set of output sounds from a third finite state automaton that includes nodes from the first and second set of input nodes and nodes that are shared based on meeting the history value.
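
The effect of the history value can be illustrated by giving every node a key made of its own sound plus the preceding `history` sounds. Two nodes from different input sets are then shared only when their keys match. This keying scheme is an assumption chosen to mirror the abstract's description, not the patent's stated data structure, and sounds are reduced to single labels for brevity.

```python
START = "<start>"


def keyed_nodes(sounds, history):
    """Key each node by its sound plus its `history` predecessors."""
    padded = [START] * history + list(sounds)
    return [tuple(padded[i:i + history + 1]) for i in range(len(sounds))]


def shared_nodes(first, second, history):
    """Nodes that both input sets would share at the given history value."""
    return set(keyed_nodes(first, history)) & set(keyed_nodes(second, history))


first = ["C", "D", "E", "F"]
second = ["G", "D", "E", "A"]
share_h0 = shared_nodes(first, second, history=0)
share_h1 = shared_nodes(first, second, history=1)
share_h2 = shared_nodes(first, second, history=2)
```

A larger history value makes sharing rarer, so the merged automaton stays closer to the original inputs. A history of zero merges every identical sound and allows the most variation.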
