Quick Patent Search - Search global patents quickly, free commercial patent database - IPRDB

1. Granted invention patent

US07016835B2 Speech and signal digitization by using recognition metrics to select from multiple techniques (In force)
Publication number: US07016835B2
Publication date: 2006-03-21
Application number: US10323549
Filing date: 2002-12-19
Applicants: Ellen Marie Eide, Ramesh Ambat Gopinath, Dimitri Kanevsky, Peder Andreas Olsen
Inventors: Ellen Marie Eide, Ramesh Ambat Gopinath, Dimitri Kanevsky, Peder Andreas Olsen
IPC classification: G10L15/00
CPC classification: G10L15/32, G10L17/26
Abstract: A characteristic-specific digitization method and apparatus are disclosed that reduces the error rate in converting input information into a computer-readable format. The input information is analyzed and subsets of the input information are classified according to whether the input information exhibits a specific physical parameter affecting recognition accuracy. If the input information exhibits the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a characteristic-specific recognizer that demonstrates improved performance for the given physical parameter. If the input information does not exhibit the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a general recognizer that performs well for typical input information. In one implementation, input speech having very low recognition accuracy as a result of a physical speech characteristic is automatically identified and recognized using a characteristic-specific speech recognizer.
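
The routing described in this abstract can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the low-energy test, and the toy recognizers are hypothetical and do not come from the patent.

```python
from typing import Callable, List, Sequence

Segment = Sequence[float]
Recognizer = Callable[[Segment], str]


def digitize(
    segments: List[Segment],
    exhibits_parameter: Callable[[Segment], bool],
    specific_recognizer: Recognizer,
    general_recognizer: Recognizer,
) -> List[str]:
    """Send each segment to the recognizer that matches its classification."""
    results = []
    for segment in segments:
        if exhibits_parameter(segment):
            # Segment shows the parameter that hurts general recognition.
            results.append(specific_recognizer(segment))
        else:
            results.append(general_recognizer(segment))
    return results


# Hypothetical stand-ins: a low mean amplitude plays the role of the
# "specific physical parameter"; real recognizers would return text.
def is_low_energy(segment: Segment) -> bool:
    return sum(abs(x) for x in segment) / len(segment) < 0.1


def toy_specific(segment: Segment) -> str:
    return "specific"


def toy_general(segment: Segment) -> str:
    return "general"


routed = digitize([[0.01, 0.02], [0.5, 0.6]], is_low_energy, toy_specific, toy_general)
```

Here `routed` records which recognizer handled each segment; in a real system the classifier and both recognizers would be trained models.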

2. Invention patent application

US20080052074A1 System and method for speech separation and multi-talker speech recognition (In force)
Publication number: US20080052074A1
Publication date: 2008-02-28
Application number: US11509939
Filing date: 2006-08-25
Applicants: Ramesh Ambat Gopinath, John Randall Hershey, Trausti Thor Kristjansson, Peder Andreas Olsen, Steven John Rennie
Inventors: Ramesh Ambat Gopinath, John Randall Hershey, Trausti Thor Kristjansson, Peder Andreas Olsen, Steven John Rennie
IPC classification: G10L15/14
CPC classification: G10L21/028, G10L15/142, G10L2021/02166
Abstract: A method, and a system to execute this method is being presented for the identification and separation of sources of an acoustic signal, which signal contains a mixture of multiple simultaneous component signals. The method represents the signal with multiple discrete state-variable sequences and combines acoustic and context level dynamics to achieve the source separation. The method identifies sources by discovering those frames of the signal whose features are dominated by single sources. The signal may be the simultaneous speech of multiple speakers.

3. Granted invention patent

US07664643B2 System and method for speech separation and multi-talker speech recognition (In force)
Publication number: US07664643B2
Publication date: 2010-02-16
Application number: US11509939
Filing date: 2006-08-25
Applicants: Ramesh Ambat Gopinath, John Randall Hershey, Trausti Thor Kristjansson, Peder Andreas Olsen, Steven John Rennie
Inventors: Ramesh Ambat Gopinath, John Randall Hershey, Trausti Thor Kristjansson, Peder Andreas Olsen, Steven John Rennie
IPC classification: G10L15/14
CPC classification: G10L21/028, G10L15/142, G10L2021/02166
Abstract: A method, and a system to execute this method is being presented for the identification and separation of sources of an acoustic signal, which signal contains a mixture of multiple simultaneous component signals. The method represents the signal with multiple discrete state-variable sequences and combines acoustic and context level dynamics to achieve the source separation. The method identifies sources by discovering those frames of the signal whose features are dominated by single sources. The signal may be the simultaneous speech of multiple speakers.

4. Granted invention patent

US08386249B2 Compressing feature space transforms (In force)
Publication number: US08386249B2
Publication date: 2013-02-26
Application number: US12636033
Filing date: 2009-12-11
Applicants: Petr Fousek, Vaibhava Goel, Etienne Marcheret, Peder Andreas Olsen
Inventors: Petr Fousek, Vaibhava Goel, Etienne Marcheret, Peder Andreas Olsen
IPC classification: G10L15/06
CPC classification: G10L19/0212, G10L19/032
Abstract: Methods for compressing a transform associated with a feature space are presented. For example, a method for compressing a transform associated with a feature space includes obtaining the transform including a plurality of transform parameters, assigning each of a plurality of quantization levels for the plurality of transform parameters to one of a plurality of quantization values, and assigning each of the plurality of transform parameters to one of the plurality of quantization values to which one of the plurality of quantization levels is assigned. One or more of obtaining the transform, assigning of each of the plurality of quantization levels, and assigning of each of the transform parameters are implemented as instruction code executed on a processor device. Further, a Viterbi algorithm may be employed for use in non-uniform level/value assignments.
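
As a rough illustration of the level/value assignment in this abstract, the Python sketch below quantizes a flat list of transform parameters against a uniform codebook. The uniform spacing and nearest-value rule are simplifying assumptions made here; the abstract itself mentions a Viterbi algorithm for non-uniform assignments, which this sketch does not attempt. All function names are hypothetical.

```python
from typing import List


def build_codebook(params: List[float], num_levels: int) -> List[float]:
    """Assign each quantization level a value, evenly spaced over the parameter range."""
    lo, hi = min(params), max(params)
    if num_levels < 2 or hi == lo:
        return [lo]  # degenerate case: a single value represents everything
    step = (hi - lo) / (num_levels - 1)
    return [lo + i * step for i in range(num_levels)]


def quantize(params: List[float], codebook: List[float]) -> List[int]:
    """Assign each parameter the index of the nearest quantization value."""
    return [
        min(range(len(codebook)), key=lambda i: abs(codebook[i] - p))
        for p in params
    ]


def dequantize(indices: List[int], codebook: List[float]) -> List[float]:
    """Rebuild approximate parameters from stored indices and the codebook."""
    return [codebook[i] for i in indices]


params = [0.0, 0.1, 0.9, 1.0, 0.5]
codebook = build_codebook(params, num_levels=3)
indices = quantize(params, codebook)
restored = dequantize(indices, codebook)
```

The compression comes from storing the small integer `indices` plus one short `codebook` instead of a full-precision value per parameter.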

5. Granted invention patent

US07395205B2 Dynamic language model mixtures with history-based buckets (In force)
Publication number: US07395205B2
Publication date: 2008-07-01
Application number: US09782434
Filing date: 2001-02-13
Applicants: Martin Franz, Peder Andreas Olsen
Inventors: Martin Franz, Peder Andreas Olsen
IPC classification: G10L15/14
CPC classification: G10L15/18, G10L15/183
Abstract: In an Automatic Speech Recognition (ASR) system having at least two language models, a method is provided for combining language model scores generated by at least two language models. A list of most likely words is generated for a current word in a word sequence uttered by a speaker, and acoustic scores corresponding to the most likely words are also generated. Language model scores are computed for each of the most likely words in the list, for each of the at least two language models. A set of coefficients to be used to combine the language model scores of each of the most likely words in the list is respectively and dynamically determined, based on a context of the current word. The language model scores of each of the most likely words in the list are respectively combined to obtain a composite score for each of the most likely words in the list, using the set of coefficients determined therefor.
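
The score combination in this abstract can be sketched in Python as a context-dependent weighted sum. How the context selects the coefficients is not spelled out in the abstract; the sketch assumes, purely for illustration, that a history is bucketed by how often it was seen in training. The names, thresholds, and toy models are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

History = Tuple[str, ...]
LanguageModel = Callable[[str, History], float]


def bucket_for_history(
    history: History, history_counts: Dict[History, int], thresholds: Sequence[int]
) -> int:
    """Map a history to a bucket index from its (assumed) training count."""
    count = history_counts.get(history, 0)
    return sum(1 for t in thresholds if count > t)


def combine_scores(
    candidates: List[str],
    history: History,
    models: List[LanguageModel],
    bucket_weights: Dict[int, Sequence[float]],
    history_counts: Dict[History, int],
    thresholds: Sequence[int] = (0, 10),
) -> Dict[str, float]:
    """Combine per-model scores using the coefficients of the history's bucket."""
    weights = bucket_weights[bucket_for_history(history, history_counts, thresholds)]
    return {
        word: sum(w * model(word, history) for w, model in zip(weights, models))
        for word in candidates
    }


# Toy models returning constant scores, so the arithmetic is easy to follow.
def model_a(word: str, history: History) -> float:
    return 0.25


def model_b(word: str, history: History) -> float:
    return 0.75


weights_by_bucket = {0: (1.0, 0.0), 1: (0.5, 0.5), 2: (0.0, 1.0)}
counts = {("the",): 5}
seen = combine_scores(["cat"], ("the",), [model_a, model_b], weights_by_bucket, counts)
unseen = combine_scores(["cat"], ("zebra",), [model_a, model_b], weights_by_bucket, counts)
```

With these toy numbers a history seen five times falls in the middle bucket and averages the two models, while an unseen history falls back entirely on the first model.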

6. Granted invention patent

US06754625B2 Augmentation of alternate word lists by acoustic confusability criterion (In force)
Publication number: US06754625B2
Publication date: 2004-06-22
Application number: US09746892
Filing date: 2000-12-26
Applicants: Peder Andreas Olsen, Michael Alan Picheny, Harry W. Printz, Karthik Visweswariah
Inventors: Peder Andreas Olsen, Michael Alan Picheny, Harry W. Printz, Karthik Visweswariah
IPC classification: G10L15/26
CPC classification: G10L15/06, G10L2015/0636
Abstract: There is provided a method for augmenting an alternate word list generated by a speech recognition system. The alternate word list includes at least one potentially correct word for replacing a wrongly decoded word. The method includes the step of identifying at least one acoustically confusable word with respect to the wrongly decoded word. The alternate word list is augmented with the at least one acoustically confusable word.
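
A small Python sketch of the augmentation step follows. The abstract does not say how acoustic confusability is measured, so the sketch swaps in a simple stand-in: edit distance between phoneme sequences in a toy pronunciation lexicon. The lexicon, the distance threshold, and the function names are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def augment_alternates(
    decoded_word: str,
    alternates: List[str],
    lexicon: Dict[str, Tuple[str, ...]],
    max_distance: int = 1,
) -> List[str]:
    """Append lexicon words whose pronunciation is close to the decoded word."""
    target = lexicon[decoded_word]
    augmented = list(alternates)
    for word, phones in lexicon.items():
        if word == decoded_word or word in augmented:
            continue
        if edit_distance(target, phones) <= max_distance:
            augmented.append(word)
    return augmented


# Toy pronunciation lexicon (hypothetical entries).
lexicon = {
    "bat": ("B", "AE", "T"),
    "pat": ("P", "AE", "T"),
    "bad": ("B", "AE", "D"),
    "dog": ("D", "AO", "G"),
}
augmented = augment_alternates("bat", ["bet"], lexicon)
```

The recognizer's original alternates are kept first and confusable words are appended after them, so the existing ranking is not disturbed.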

7. Granted invention patent

US5864805A Method and apparatus for error correction in a continuous dictation system (Expired)
Publication number: US5864805A
Publication date: 1999-01-26
Application number: US770390
Filing date: 1996-12-20
Applicants: Chengjun Julian Chen, Liam David Comerford, Catalina Maria Danis, Satya Dharanipragada, Michael Daniel Monkowski, Peder Andreas Olsen, Michael Alan Picheny
Inventors: Chengjun Julian Chen, Liam David Comerford, Catalina Maria Danis, Satya Dharanipragada, Michael Daniel Monkowski, Peder Andreas Olsen, Michael Alan Picheny
IPC classification: G10L15/22, G10L7/08
CPC classification: G10L15/22
Abstract: A continuous speech recognition system has the ability to correct errors in strings of words. The error correction method stores data in the system's internal state to update probability tables used in developing alternative lists for substitution in misrecognized text.

8. Invention patent application

US20110144991A1 Compressing Feature Space Transforms (In force)
Publication number: US20110144991A1
Publication date: 2011-06-16
Application number: US12636033
Filing date: 2009-12-11
Applicants: Petr Fousek, Vaibhava Goel, Etienne Marcheret, Peder Andreas Olsen
Inventors: Petr Fousek, Vaibhava Goel, Etienne Marcheret, Peder Andreas Olsen
IPC classification: G10L15/06
CPC classification: G10L19/0212, G10L19/032
Abstract: Methods for compressing a transform associated with a feature space are presented. For example, a method for compressing a transform associated with a feature space includes obtaining the transform including a plurality of transform parameters, assigning each of a plurality of quantization levels for the plurality of transform parameters to one of a plurality of quantization values, and assigning each of the plurality of transform parameters to one of the plurality of quantization values to which one of the plurality of quantization levels is assigned. One or more of obtaining the transform, assigning of each of the plurality of quantization levels, and assigning of each of the transform parameters are implemented as instruction code executed on a processor device. Further, a Viterbi algorithm may be employed for use in non-uniform level/value assignments.

9. Granted invention patent

US07219056B2 Determining and using acoustic confusability, acoustic perplexity and synthetic acoustic word error rate (In force)
Publication number: US07219056B2
Publication date: 2007-05-15
Application number: US09838449
Filing date: 2001-04-19
Applicants: Scott Elliot Axelrod, Peder Andreas Olsen, Harry William Printz, Peter Vincent de Souza
Inventors: Scott Elliot Axelrod, Peder Andreas Olsen, Harry William Printz, Peter Vincent de Souza
IPC classification: G10L15/00, G06F7/60
CPC classification: G10L15/01
Abstract: Two statistics are disclosed for determining the quality of language models. These statistics are called acoustic perplexity and the synthetic acoustic word error rate (SAWER), and they depend upon methods for computing the acoustic confusability of words. It is possible to substitute models of acoustic data in place of real acoustic data in order to determine acoustic confusability. An evaluation model is created, a synthesizer model is created, and a matrix is determined from the evaluation and synthesizer models. Each of the evaluation and synthesizer models is a hidden Markov model. Once the matrix is determined, a confusability calculation may be performed. Different methods are used to determine synthetic likelihoods. The confusability may be normalized and smoothed and methods are disclosed that increase the speed of performing the matrix inversion and the confusability calculation. A method for caching and reusing computations for similar words is disclosed. Acoustic perplexity and SAWER are determined and applied.
