Quick Patent Search: fast search of global patents, free commercial patent database - IPRDB

1. Granted invention patent

US07801727B2 System and methods for acoustic and language modeling for automatic speech recognition with large vocabularies (in force)
Publication (announcement) number: US07801727B2
Publication (announcement) date: 2010-09-21
Application number: US11064643
Filing date: 2005-02-24
Applicants: Ponani Gopalakrishnan, Dimitri Kanevsky, Michael Daniel Monkowski, Jan Sedivy
Inventors: Ponani Gopalakrishnan, Dimitri Kanevsky, Michael Daniel Monkowski, Jan Sedivy
IPC classification: G10L15/04
CPC classification: G10L15/197, G06F17/27, G10L15/183, Y10S707/99942
Abstract: A method for generating a language component vocabulary VC for a speech recognition system having a language vocabulary V of a plurality of word forms is disclosed. The method includes: partitioning the language vocabulary V into subsets of word forms based on frequencies of occurrence of the respective word forms; and in at least one of the subsets, splitting word forms having frequencies less than a threshold to thereby generate word form components. Also disclosed is a method for use in speech recognition including: splitting an acoustic vocabulary comprising baseforms into baseform components and storing the baseform components; and performing sound to spelling mapping on the baseform components so as to generate a baseform components to word parts table for use in subsequent decoding of speech. A method for decoding a speech utterance using language model components and acoustic components includes the steps of: generating from the utterance a stack of baseform component paths; concatenating baseform components in a path to generate concatenated baseforms, when the concatenated baseform components correspond to a baseform found in an acoustic vocabulary; mapping the concatenated baseforms into words; computing language model (LM) scores associated with the words using a language model; and performing further decoding of the utterance based thereupon.
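
The decoding step in this abstract (concatenating baseform components in a path when the result matches a baseform in the acoustic vocabulary, then mapping to words) can be illustrated with a rough sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function name `concatenate_components`, the representation of baseforms as tuples of phone symbols, the longest-match-first strategy, and the toy vocabulary. LM scoring and the stack search are left out.

```python
def concatenate_components(path, acoustic_vocab):
    """Merge adjacent baseform components into words.

    path: list of components, each a tuple of phone symbols (assumed format).
    acoustic_vocab: dict mapping a full baseform (tuple of phones) to a word.
    Longest-match-first merging is an illustrative choice, not the patent's.
    """
    words = []
    i = 0
    while i < len(path):
        for j in range(len(path), i, -1):       # try the longest span first
            candidate = sum(path[i:j], ())      # concatenate the phone tuples
            if candidate in acoustic_vocab:
                words.append(acoustic_vocab[candidate])
                i = j
                break
        else:
            words.append(None)                  # no known baseform starts here
            i += 1
    return words


# Hypothetical toy data: two components that together form one baseform.
toy_vocab = {("W", "AO", "K", "T"): "walked", ("DH", "AH"): "the"}
toy_path = [("W", "AO"), ("K", "T"), ("DH", "AH")]
result = concatenate_components(toy_path, toy_vocab)
```

In a fuller decoder, the words returned for each path would then be scored with a language model before further search.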

2. Granted invention patent

US06928404B1 System and methods for acoustic and language modeling for automatic speech recognition with large vocabularies (in force)
Publication (announcement) number: US06928404B1
Publication (announcement) date: 2005-08-09
Application number: US09271469
Filing date: 1999-03-17
Applicants: Ponani Gopalakrishnan, Dimitri Kanevsky, Michael Daniel Monkowski, Jan Sedivy
Inventors: Ponani Gopalakrishnan, Dimitri Kanevsky, Michael Daniel Monkowski, Jan Sedivy
IPC classification: G06F17/27, G10L15/18, G06F17/21, G06F17/28
CPC classification: G10L15/197, G06F17/27, G10L15/183, Y10S707/99942
Abstract: Systems and methods are provided for generating a language component vocabulary VC for a speech recognition system having a language vocabulary V of a plurality of word forms. One method for generating a language component vocabulary VC for a speech recognition system having a language vocabulary V of a plurality of word forms includes partitioning the language vocabulary V into subsets of word forms based on frequencies of occurrence of the respective word forms; in at least one of the subsets, splitting word forms having frequencies less than a threshold to thereby generate word form components; and generating a language component vocabulary VC including word forms and word form components. The resulting language component vocabulary, which includes word forms and word components, is used to generate a language model that can be efficiently implemented for real-time automatic speech recognition applications for languages with large vocabularies.
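
As a minimal sketch of the partition-and-split idea in this abstract: word forms are grouped into frequency bands, and rare forms in the lowest band are split into components. The function names, the band boundaries, the fixed-length stem/ending split, and the `-` markers are all hypothetical choices made for this example; the abstract does not specify how the split is performed.

```python
from collections import Counter


def partition_by_frequency(counts, boundaries):
    """Group word forms into subsets ordered from most to least frequent.

    boundaries: descending lower bounds, e.g. [100, 10] gives three subsets
    (freq >= 100, 10 <= freq < 100, freq < 10).
    """
    subsets = [[] for _ in range(len(boundaries) + 1)]
    for word, freq in counts.items():
        band = sum(freq < b for b in boundaries)   # number of bounds not reached
        subsets[band].append(word)
    return subsets


def component_vocabulary(counts, boundaries, threshold, ending_len=2):
    """Build VC: whole word forms plus components of rare word forms."""
    *frequent, rarest = partition_by_frequency(counts, boundaries)
    vocab = set()
    for subset in frequent:
        vocab.update(subset)                       # frequent forms stay whole
    for word in rarest:
        if counts[word] < threshold and len(word) > ending_len:
            vocab.add(word[:-ending_len] + "-")    # stem component (assumed split)
            vocab.add("-" + word[-ending_len:])    # ending component
        else:
            vocab.add(word)
    return vocab


# Hypothetical toy counts.
toy_counts = Counter({"the": 500, "walked": 40, "perambulated": 2, "ox": 1})
toy_vc = component_vocabulary(toy_counts, boundaries=[100, 10], threshold=5)
```

Because many rare forms share stems and endings, a vocabulary built this way can be much smaller than the full list of word forms, which is the motivation the abstract gives for highly inflected languages.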

3. Granted invention patent

US6073091A Apparatus and method for forming a filtered inflected language model for automatic speech recognition (lapsed)
Publication (announcement) number: US6073091A
Publication (announcement) date: 2000-06-06
Application number: US906812
Filing date: 1997-08-06
Applicants: Dimitri Kanevsky, Michael Daniel Monkowski, Jan Sedivy
Inventors: Dimitri Kanevsky, Michael Daniel Monkowski, Jan Sedivy
IPC classification: G10L15/18, G06F17/28, G10L5/06, G10L9/00
CPC classification: G10L15/197
Abstract: A method of forming a language model for a language having a selected vocabulary of word forms comprises: (a) mapping the word forms into integer vectors in accordance with frequencies of word form occurrence; (b) partitioning the integer vectors into subsets, the subsets respectively having ranges of frequencies of word form occurrence associated therewith, the subsets being arranged in a descending order of frequency ranges; (c) respectively assigning maps to the subsets; (d) filtering a textual corpora using the maps assigned to the subsets in order to generate indexed integers; (e) determining n-gram statistics for the indexed integers; and (f) estimating n-gram language model probabilities from the n-gram statistics to form the language model.
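
Steps (a) through (f) can be approximated with a heavily simplified sketch. It assumes only two frequency subsets (the `keep` most frequent word forms get their own index, all others share index 0), bigrams as the n-grams, and maximum-likelihood estimates without smoothing. These simplifications and the names `frequency_index` and `bigram_model` are assumptions for illustration, not the patent's procedure.

```python
from collections import Counter


def frequency_index(tokens, keep):
    """Map the `keep` most frequent word forms to 1..keep by rank."""
    ranked = [word for word, _ in Counter(tokens).most_common(keep)]
    return {word: rank for rank, word in enumerate(ranked, start=1)}


def bigram_model(tokens, keep):
    """Filter a corpus to integer indices and estimate bigram probabilities."""
    index = frequency_index(tokens, keep)
    ids = [index.get(word, 0) for word in tokens]   # rare forms collapse to 0
    history_counts = Counter(ids[:-1])              # counts of first elements
    pair_counts = Counter(zip(ids, ids[1:]))        # bigram statistics
    probs = {pair: n / history_counts[pair[0]] for pair, n in pair_counts.items()}
    return index, probs


# Hypothetical toy corpus.
toy_tokens = "a b a b a c".split()
toy_index, toy_probs = bigram_model(toy_tokens, keep=2)
```

Here `toy_probs[(1, 2)]` is the estimated probability that the second most frequent form follows the most frequent one. A practical model would add smoothing and more than two subsets.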

4. Granted invention patent

US5751905A Statistical acoustic processing method and apparatus for speech recognition using a toned phoneme system (lapsed)
Publication (announcement) number: US5751905A
Publication (announcement) date: 1998-05-12
Application number: US404786
Filing date: 1995-03-15
Applicants: Chengjun Julian Chen, Ramesh Ambat Gopinath, Michael Daniel Monkowski, Michael Alan Picheny
Inventors: Chengjun Julian Chen, Ramesh Ambat Gopinath, Michael Daniel Monkowski, Michael Alan Picheny
IPC classification: G10L15/10, G10L11/04, G10L15/14, G10L15/18, G10L5/06
CPC classification: G10L25/90, G10L15/142, G10L25/06, G10L25/15
Abstract: A method and apparatus for acoustic signal processing of speech recognition, the method comprising the following components: 1) Decompose each syllable into two phonemes of comparable length and complexity, the first one being a preme, and the second one being a toneme; 2) Each toneme is assigned a tone value such as high, rising, low, falling, and untoned; 3) No tone value is assigned to premes; 4) Pitch is detected continuously and treated the same way as energy and cepstrals in a Hidden Markov Model to predict the tone of a toneme; 5) The tone of a syllable is defined as the tone of its component toneme.
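
Components 1), 2), 3) and 5) describe a data decomposition that a short sketch can make concrete. The example assumes Mandarin syllables written in pinyin with a trailing tone digit, treats the pinyin initial as the preme, and maps tone digits 1 to 5 onto the five tone values named in the abstract. The initial list, the digit mapping, and the function name `decompose` are assumptions for illustration; the patent's actual phoneme inventory may differ. The pitch-based HMM in component 4) is not modeled.

```python
# Pinyin initials, two-letter ones first so "zh" wins over "z" (assumed inventory).
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")

# Assumed mapping from pinyin tone digits to the abstract's tone values.
TONES = {"1": "high", "2": "rising", "3": "low", "4": "falling", "5": "untoned"}


def decompose(syllable):
    """Split e.g. 'zhong1' into an untoned preme and a (toneme, tone) pair."""
    body, tone = syllable[:-1], TONES[syllable[-1]]
    preme = next((ini for ini in INITIALS if body.startswith(ini)), "")
    toneme = body[len(preme):]
    return preme, (toneme, tone)       # the preme carries no tone value


def syllable_tone(syllable):
    """The tone of a syllable is the tone of its toneme."""
    return decompose(syllable)[1][1]
```

For instance, `decompose("ma3")` yields the preme `"m"` and the toneme `("a", "low")`. Syllables with no initial get an empty preme in this sketch, which is a simplification.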

5. Granted invention patent

US07542901B2 Methods and apparatus for generating dialog state conditioned language models (in force)
Publication (announcement) number: US07542901B2
Publication (announcement) date: 2009-06-02
Application number: US11509390
Filing date: 2006-08-24
Applicants: Satyanarayana Dharanipragada, Michael Daniel Monkowski, Harry W. Printz, Karthik Visweswariah
Inventors: Satyanarayana Dharanipragada, Michael Daniel Monkowski, Harry W. Printz, Karthik Visweswariah
IPC classification: G10L15/06, G10L15/18
CPC classification: G10L15/197, G10L15/183, G10L2015/228
Abstract: Techniques are provided for generating improved language modeling. Such improved modeling is achieved by conditioning a language model on a state of a dialog for which the language model is employed. For example, the techniques of the invention may improve modeling of language for use in a speech recognizer of an automatic natural language based dialog system. Improved usability of the dialog system arises from better recognition of a user's utterances by a speech recognizer, associated with the dialog system, using the dialog state-conditioned language models. By way of example, the state of the dialog may be quantified as: (i) the internal state of the natural language understanding part of the dialog system; or (ii) words in the prompt that the dialog system played to the user.
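
One simple way to picture conditioning a language model on dialog state is to keep separate word counts per state and interpolate them with state-independent counts. The sketch below does this for unigrams only. The class name `StateConditionedUnigram`, the interpolation weight, and the state labels are hypothetical, and the abstract does not say that interpolation is the technique used.

```python
from collections import Counter, defaultdict


class StateConditionedUnigram:
    """Unigram model interpolating per-state and overall word frequencies."""

    def __init__(self, weight=0.7):
        self.weight = weight                    # share given to the state model
        self.by_state = defaultdict(Counter)    # dialog state -> word counts
        self.overall = Counter()                # state-independent counts

    def observe(self, state, words):
        """Record user words heard while the dialog was in `state`."""
        self.by_state[state].update(words)
        self.overall.update(words)

    def prob(self, word, state):
        total = sum(self.overall.values())
        general = self.overall[word] / total if total else 0.0
        counts = self.by_state.get(state)
        if not counts:                          # unseen state: fall back
            return general
        specific = counts[word] / sum(counts.values())
        return self.weight * specific + (1 - self.weight) * general


# Hypothetical dialog states, e.g. derived from the prompt just played.
lm = StateConditionedUnigram()
lm.observe("ask_date", ["monday", "tuesday"])
lm.observe("ask_city", ["boston", "denver"])
```

With this toy data, `lm.prob("monday", "ask_date")` is larger than `lm.prob("monday", "ask_city")`, which is the effect the abstract describes: the recognizer favors words that fit the current point in the dialog.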

6. Invention patent application

US20080215329A1 Methods and Apparatus for Generating Dialog State Conditioned Language Models (in force)
Publication (announcement) number: US20080215329A1
Publication (announcement) date: 2008-09-04
Application number: US12057646
Filing date: 2008-03-28
Applicants: Satyanarayana Dharanipragada, Michael Daniel Monkowski, Harry W. Printz, Karthik Visweswariah
Inventors: Satyanarayana Dharanipragada, Michael Daniel Monkowski, Harry W. Printz, Karthik Visweswariah
IPC classification: G10L15/28
CPC classification: G10L15/197, G10L15/183, G10L2015/228
Abstract: Techniques are provided for generating improved language modeling. Such improved modeling is achieved by conditioning a language model on a state of a dialog for which the language model is employed. For example, the techniques of the invention may improve modeling of language for use in a speech recognizer of an automatic natural language based dialog system. Improved usability of the dialog system arises from better recognition of a user's utterances by a speech recognizer, associated with the dialog system, using the dialog state-conditioned language models. By way of example, the state of the dialog may be quantified as: (i) the internal state of the natural language understanding part of the dialog system; or (ii) words in the prompt that the dialog system played to the user.

7. Granted invention patent

US07143035B2 Methods and apparatus for generating dialog state conditioned language models (in force)
Publication (announcement) number: US07143035B2
Publication (announcement) date: 2006-11-28
Application number: US10107723
Filing date: 2002-03-27
Applicants: Satyanarayana Dharanipragada, Michael Daniel Monkowski, Harry W. Printz, Karthik Visweswariah
Inventors: Satyanarayana Dharanipragada, Michael Daniel Monkowski, Harry W. Printz, Karthik Visweswariah
IPC classification: G10L15/06, G10L15/22
CPC classification: G10L15/197, G10L15/183, G10L2015/228
Abstract: Techniques are provided for generating improved language modeling. Such improved modeling is achieved by conditioning a language model on a state of a dialog for which the language model is employed. For example, the techniques of the invention may improve modeling of language for use in a speech recognizer of an automatic natural language based dialog system. Improved usability of the dialog system arises from better recognition of a user's utterances by a speech recognizer, associated with the dialog system, using the dialog state-conditioned language models. By way of example, the state of the dialog may be quantified as: (i) the internal state of the natural language understanding part of the dialog system; or (ii) words in the prompt that the dialog system played to the user.

8. Granted invention patent

US5864805A Method and apparatus for error correction in a continuous dictation system (lapsed)
Publication (announcement) number: US5864805A
Publication (announcement) date: 1999-01-26
Application number: US770390
Filing date: 1996-12-20
Applicants: Chengjun Julian Chen, Liam David Comerford, Catalina Maria Danis, Satya Dharanipragada, Michael Daniel Monkowski, Peder Andreas Olsen, Michael Alan Picheny
Inventors: Chengjun Julian Chen, Liam David Comerford, Catalina Maria Danis, Satya Dharanipragada, Michael Daniel Monkowski, Peder Andreas Olsen, Michael Alan Picheny
IPC classification: G10L15/22, G10L7/08
CPC classification: G10L15/22
Abstract: A continuous speech recognition system has the ability to correct errors in strings of words. The error correction method stores data in the system's internal state to update probability tables used in developing alternative lists for substitution in misrecognized text.

9. Granted invention patent

US07853449B2 Methods and apparatus for generating dialog state conditioned language models (in force)
Publication (announcement) number: US07853449B2
Publication (announcement) date: 2010-12-14
Application number: US12057646
Filing date: 2008-03-28
Applicants: Satyanarayana Dharanipragada, Michael Daniel Monkowski, Harry W. Printz, Karthik Visweswariah
Inventors: Satyanarayana Dharanipragada, Michael Daniel Monkowski, Harry W. Printz, Karthik Visweswariah
IPC classification: G10L15/06, G10L15/10, G10L15/18
CPC classification: G10L15/197, G10L15/183, G10L2015/228
Abstract: Techniques are provided for generating improved language modeling. Such improved modeling is achieved by conditioning a language model on a state of a dialog for which the language model is employed. For example, the techniques of the invention may improve modeling of language for use in a speech recognizer of an automatic natural language based dialog system. Improved usability of the dialog system arises from better recognition of a user's utterances by a speech recognizer, associated with the dialog system, using the dialog state-conditioned language models. By way of example, the state of the dialog may be quantified as: (i) the internal state of the natural language understanding part of the dialog system; or (ii) words in the prompt that the dialog system played to the user.

10. Granted invention patent

US06269335B1 Apparatus and methods for identifying homophones among words in a speech recognition system (in force)
Publication (announcement) number: US06269335B1
Publication (announcement) date: 2001-07-31
Application number: US09134261
Filing date: 1998-08-14
Applicants: Abraham Ittycheriah, Stephane Herman Maes, Michael Daniel Monkowski, Jeffrey Scott Sorensen
Inventors: Abraham Ittycheriah, Stephane Herman Maes, Michael Daniel Monkowski, Jeffrey Scott Sorensen
IPC classification: G10L21/00
CPC classification: G10L15/22
Abstract: A method of identifying homophones of a word uttered by a user from at least a portion of existing words of a vocabulary of a speech recognition engine comprises the steps of: a user uttering the word; decoding the uttered word; computing respective measures between the decoded word and at least a portion of the other existing vocabulary words, the respective measures indicative of acoustic similarity between the word and the at least a portion of other existing words; if at least one measure is within a threshold range, indicating, to the user, results associated with the at least one measure, the results preferably including the decoded word and the other existing vocabulary word associated with the at least one measure; and the user preferably making a selection depending on the word the user intended to utter.
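
The measure-and-threshold step can be sketched with a stand-in similarity measure. The example below uses Levenshtein edit distance over phone sequences, which is a substitute chosen for simplicity and not the acoustic measure of the patent. The function names, the lexicon format, and the toy pronunciations are likewise assumptions.

```python
def phone_distance(a, b):
    """Levenshtein distance between two phone sequences (stand-in measure)."""
    prev = list(range(len(b) + 1))
    for i, phone_a in enumerate(a, start=1):
        curr = [i]
        for j, phone_b in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                       # deletion
                            curr[j - 1] + 1,                   # insertion
                            prev[j - 1] + (phone_a != phone_b)))  # substitution
        prev = curr
    return prev[-1]


def confusable_words(decoded, lexicon, threshold=1):
    """Return vocabulary words whose distance to `decoded` is within threshold."""
    target = lexicon[decoded]
    return sorted(word for word, phones in lexicon.items()
                  if word != decoded and phone_distance(target, phones) <= threshold)


# Hypothetical toy lexicon mapping words to phone sequences.
toy_lexicon = {
    "right": ("R", "AY", "T"),
    "write": ("R", "AY", "T"),
    "ride": ("R", "AY", "D"),
    "boat": ("B", "OW", "T"),
}
exact_homophones = confusable_words("right", toy_lexicon, threshold=0)
near_homophones = confusable_words("right", toy_lexicon, threshold=1)
```

In an interactive system, the decoded word and the returned candidates would be shown to the user, who then picks the intended word.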
