Quick Patent Search - Search global patents quickly, free patent database for commercial use - IPRDB

1. Granted invention patent

US06662159B2 Recognizing speech data using a state transition model (Lapsed)
Publication No.: US06662159B2
Publication date: 2003-12-09
Application No.: US08739013
Filing date: 1996-10-28
Applicant(s): Yasuhiro Komori, Yasunori Ohora, Masayuki Yamada
Inventor(s): Yasuhiro Komori, Yasunori Ohora, Masayuki Yamada
IPC classification: G10L15/14
CPC classification: G10L15/142, G10L2015/085
Abstract: Detecting an unknown word in input speech data reduces the search space and the memory capacity for the unknown word. For this purpose, an HMM data memory stores data describing a state transition model for the unknown word, defined by a number of states and the transition probability between the states. An output probability calculation unit acquires a state of the maximum likelihood at each time of the speech data, among the plural states employed in the state transition model for a known word, employed in the speech recognition of the known word. The obtained result is applied to the state transition model for the unknown word, stored in the HMM data memory, to obtain a state transition model of the unknown word. A different output probability calculation unit determines the likelihood of the state transition model for the known word. Then a language search unit effects the language search process, utilizing the likelihoods determined by the aforementioned two output probability calculation units, in a portion where the presence of the unknown word is permitted by the dictionary.
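One possible reading of the unknown-word scoring in this abstract, as a minimal Python sketch: at every frame, the best log output probability among the known-word states is reused as the emission score of the unknown-word model. The function name, the single-state self-loop, and the sample numbers are assumptions made for illustration; the patent's actual model topology is not given in the abstract.

```python
import math


def unknown_word_score(state_loglikes, self_loop_prob=0.9):
    """Score a span of frames against a simplified unknown-word model.

    state_loglikes[t][s] is the log output probability of known-word
    state s at frame t. Assumption: the unknown-word model is reduced
    to one state with a self-loop of probability self_loop_prob.
    """
    loop = math.log(self_loop_prob)
    # Per frame, take the maximum-likelihood known-word state and add the
    # log transition probability of staying in the unknown-word state.
    return sum(max(frame) + loop for frame in state_loglikes)


# Made-up log-likelihoods: two frames, two known-word states.
frames = [[-1.0, -2.0], [-3.0, -0.5]]
score = unknown_word_score(frames, self_loop_prob=1.0)
```

Because the per-frame maxima are already needed for known-word recognition, this kind of scheme adds no separate acoustic model for the unknown word, which is consistent with the reduced memory the abstract claims.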

2. Granted invention patent

US06542866B1 Speech recognition method and apparatus utilizing multiple feature streams (In force)
Publication No.: US06542866B1
Publication date: 2003-04-01
Application No.: US09401635
Filing date: 1999-09-22
Applicant(s): Li Jiang, Xuedong Huang
Inventor(s): Li Jiang, Xuedong Huang
IPC classification: G10L15/14
CPC classification: G10L15/02
Abstract: A method and apparatus are provided for using multiple feature streams in speech recognition. In the method and apparatus, a feature extractor generates at least two feature vectors for a segment of an input signal. A decoder then generates a path score that is indicative of the probability that a word is represented by the input signal. The path score is generated by selecting the best feature vector to use for each segment. For each segment, the corresponding part in the path score for that segment is based in part on a chosen segment score that is selected from a group of at least two segment scores. The segment scores each represent a separate probability that a particular segment unit (e.g. senone, phoneme, diphone, triphone, or word) appears in that segment of the input signal. Although each segment score in the group relates to the same segment unit, the scores are based on different feature vectors for the segment.
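The per-segment selection described in this abstract can be sketched in Python as follows. This is a minimal illustration that assumes log-domain segment scores and a path score formed by simple summation; `path_score` and the sample values are hypothetical and not taken from the patent.

```python
def path_score(segment_scores):
    """Sum, over segments, the best log-score among the feature streams.

    segment_scores[t][k] is a log-probability that the hypothesized
    segment unit appears in segment t, computed from feature stream k.
    Returns the total and the index of the stream chosen per segment.
    """
    total = 0.0
    choices = []
    for scores in segment_scores:
        # Pick the stream whose score is highest for this segment.
        best = max(range(len(scores)), key=lambda k: scores[k])
        choices.append(best)
        total += scores[best]
    return total, choices


# Made-up log-scores: three segments, two feature streams each.
scores = [[-1.0, -2.0], [-3.0, -0.5], [-1.5, -1.5]]
total, choices = path_score(scores)
```

On a tie, `max` returns the first stream, so the third segment above selects stream 0.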

3. Granted invention patent

US06499012B1 Method and apparatus for hierarchical training of speech models for use in speaker verification (In force)
Publication No.: US06499012B1
Publication date: 2002-12-24
Application No.: US09470995
Filing date: 1999-12-23
Applicant(s): Stephen Douglas Peters, Matthieu Hebert, Daniel Boies
Inventor(s): Stephen Douglas Peters, Matthieu Hebert, Daniel Boies
IPC classification: G10L15/14
CPC classification: G10L17/04
Abstract: A method and apparatus for generating a pair of data elements is provided suitable for use in a speaker verification system. The pair includes a first element representative of a speaker independent template and a second element representative of an extended speaker specific speech pattern. An audio signal forming enrollment data associated with a given speaker is received and processed to derive a speaker independent template and a speaker specific speech pattern. The speaker specific speech pattern is then processed to derive an extended speaker specific speech pattern. The extended speaker specific speech pattern includes a set of expanded speech models, each expanded speech model including a plurality of groups of states, the groups of states being linked to one another by inter-group transitions. Optionally, the expanded speech models are processed on the basis of the enrollment data to condition at least one of the plurality of inter-group transitions.

4. Granted invention patent

US06466908B1 System and method for training a class-specific hidden Markov model using a modified Baum-Welch algorithm (Lapsed)
Publication No.: US06466908B1
Publication date: 2002-10-15
Application No.: US09484132
Filing date: 2000-01-14
Applicant(s): Paul M. Baggenstoss
Inventor(s): Paul M. Baggenstoss
IPC classification: G10L15/14
CPC classification: G10L15/144
Abstract: A system and method for training a class-specific hidden Markov model (HMM) is used for modeling physical phenomena, such as speech, characterized by a finite number of states. The method receives training data and estimates parameters of the class-specific HMM from the training data using a modified Baum-Welch algorithm, which uses likelihood ratios with respect to a common state (e.g., noise) and based on sufficient statistics for each state. The parameters are stored for use in processing signals representing the physical phenomena, for example, in speech processing applications. The modified Baum-Welch algorithm is an iterative algorithm including class-specific forward and backward procedures and HMM reestimation formulas.
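As a rough illustration of a class-specific forward procedure, the Python sketch below runs the ordinary HMM forward recursion but replaces each state's emission density with a likelihood ratio against the common state. It assumes the ratios have already been computed from each state's own statistics; the function name and all inputs are hypothetical, and the backward pass and reestimation formulas mentioned in the abstract are not shown.

```python
def class_specific_forward(pi, A, ratios):
    """Forward recursion over likelihood ratios instead of raw densities.

    pi[j]        initial probability of state j
    A[i][j]      transition probability from state i to state j
    ratios[t][j] p(z_j at time t | state j) / p(z_j at time t | common state),
                 where z_j is the statistic used for state j (assumed given)
    Returns the total forward score, which is a likelihood ratio of the
    whole sequence relative to the common-state hypothesis.
    """
    n = len(pi)
    alpha = [pi[j] * ratios[0][j] for j in range(n)]
    for r in ratios[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * r[j]
            for j in range(n)
        ]
    return sum(alpha)


# Made-up two-state example with two observation times.
pi = [1.0, 0.0]
A = [[0.5, 0.5], [0.0, 1.0]]
ratios = [[2.0, 1.0], [1.0, 4.0]]
score = class_specific_forward(pi, A, ratios)
```

For long sequences a practical version would scale `alpha` at each step or work in the log domain to avoid underflow.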

5. Granted invention patent

US06463413B1 Speech recognition training for small hardware devices (In force)
Publication No.: US06463413B1
Publication date: 2002-10-08
Application No.: US09295276
Filing date: 1999-04-20
Applicant(s): Ted H. Applebaum, Jean-Claude Junqua
Inventor(s): Ted H. Applebaum, Jean-Claude Junqua
IPC classification: G10L15/14
CPC classification: G10L15/30, G10L15/06, G10L15/187, G10L15/22, G10L2015/0638
Abstract: A distributed speech processing system for constructing speech recognition reference models that are to be used by a speech recognizer in a small hardware device, such as a personal digital assistant or cellular telephone. The speech processing system includes a speech recognizer residing on a first computing device and a speech model server residing on a second computing device. The speech recognizer receives speech training data and processes it into an intermediate representation of the speech training data. The intermediate representation is then communicated to the speech model server. The speech model server generates a speech reference model by using the intermediate representation of the speech training data and then communicates the speech reference model back to the first computing device for storage in a lexicon associated with the speech recognizer.

6. Granted invention patent

US06434522B1 Combined quantized and continuous feature vector HMM approach to speech recognition (Lapsed)
Publication No.: US06434522B1
Publication date: 2002-08-13
Application No.: US08864460
Filing date: 1997-05-28
Applicant(s): Eiichi Tsuboka
Inventor(s): Eiichi Tsuboka
IPC classification: G10L15/14
CPC classification: G10L15/144
Abstract: A device capable of achieving recognition at a high accuracy and with fewer calculations and which utilizes an HMM. The present device has a vector quantizing circuit generating a model by quantizing vectors of a training pattern having a vector series, and converting the vectors into a label series of clusters to which they belong, a continuous distribution probability density HMM generating circuit for generating a continuous distribution probability density HMM from a quantized vector series corresponding to each label of the label series, and a label incidence calculating circuit for calculating the incidence of the labels in each state from the training vectors classified in the same clusters and the continuous distribution probability density HMM.

7. Granted invention patent

US06374220B1 N-best search for continuous speech recognition using viterbi pruning for non-output differentiation states (In force)
Publication No.: US06374220B1
Publication date: 2002-04-16
Application No.: US09353969
Filing date: 1999-07-15
Applicant(s): Yu-Hung Kao
Inventor(s): Yu-Hung Kao
IPC classification: G10L15/14
CPC classification: G10L15/083, G10L15/08, G10L2015/085
Abstract: A method for N-best search for continuous speech recognition with limited storage space includes the steps of Viterbi pruning word level (same word, different time alignment, thus non-output differentiation) states and keeping the N-best sub-optimal paths for sentence level (output differentiation) states.
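The two pruning rules in this abstract can be illustrated with a small Python sketch of what happens when several partial paths arrive at the same search state. The function name, the tuple layout, and the example hypotheses are assumptions for illustration only; a real decoder would apply this inside its frame-by-frame search loop.

```python
import heapq


def merge_paths(incoming, output_differentiating, n_best):
    """Decide which partial paths survive at one search state.

    incoming is a list of (log_score, word_sequence) tuples.
    Word-level states (non-output-differentiating) hold paths that differ
    only in time alignment, so Viterbi pruning keeps the single best one.
    Sentence-level states (output-differentiating) keep the n_best paths.
    """
    if not output_differentiating:
        return [max(incoming, key=lambda p: p[0])]
    return heapq.nlargest(n_best, incoming, key=lambda p: p[0])


# Made-up hypotheses reaching one state.
paths = [
    (-5.0, ("call", "home")),
    (-3.0, ("call", "home")),
    (-4.0, ("call", "Rome")),
]
word_level = merge_paths(paths, output_differentiating=False, n_best=2)
sentence_level = merge_paths(paths, output_differentiating=True, n_best=2)
```

Keeping one path at word-level states is what bounds the storage: the N-fold bookkeeping is only paid where alternative outputs can actually differ.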

8. Granted invention patent

US06226612B1 Method of evaluating an utterance in a speech recognition system (Lapsed)
Publication No.: US06226612B1
Publication date: 2001-05-01
Application No.: US09016214
Filing date: 1998-01-30
Applicant(s): Edward Srenger, Jeffrey A. Meunier, William M. Kushner
Inventor(s): Edward Srenger, Jeffrey A. Meunier, William M. Kushner
IPC classification: G10L15/14
CPC classification: G10L15/063, G10L2015/088
Abstract: The present invention provides a method of calculating, within the framework of a speaker dependent system, a standard filler, or garbage model, for the detection of out-of-vocabulary utterances. In particular, the method receives new training data in a speech recognition system (202); calculates statistical parameters for the new training data (204); calculates global statistical parameters based upon the statistical parameters for the new training data (206); and updates a garbage model based upon the global statistical parameters (208). This is carried out on-line while the user is enrolling the vocabulary. The garbage model described in this disclosure is preferably an average speaker model, representative of all the speech data enrolled by the user to date. Also, the garbage model is preferably obtained as a by-product of the vocabulary enrollment procedure and is similar in its characteristics and topology to all the other regular vocabulary HMMs.
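The on-line update loop in this abstract (steps 202 to 208) can be sketched in Python with running sums. This is a simplified illustration that assumes the garbage model is a single diagonal Gaussian over feature frames; the abstract describes an HMM with the same topology as the vocabulary models, which is not reproduced here. The class and method names are hypothetical.

```python
class GarbageModelStats:
    """Running global statistics over all feature frames enrolled so far."""

    def __init__(self, dim):
        self.n = 0
        self.total = [0.0] * dim
        self.total_sq = [0.0] * dim

    def update(self, frames):
        """Fold one new enrollment utterance into the global statistics."""
        for frame in frames:
            self.n += 1
            for d, x in enumerate(frame):
                self.total[d] += x
                self.total_sq[d] += x * x

    def model(self):
        """Return per-dimension (mean, variance) for the garbage model."""
        mean = [s / self.n for s in self.total]
        var = [sq / self.n - m * m for sq, m in zip(self.total_sq, mean)]
        return mean, var


# Two enrollment utterances with made-up two-dimensional frames.
stats = GarbageModelStats(dim=2)
stats.update([[1.0, 2.0], [3.0, 4.0]])
stats.update([[5.0, 6.0]])
mean, var = stats.model()
```

Because only sums are stored, the model can be refreshed after every enrolled word without keeping the earlier audio, in line with the "by-product of enrollment" idea in the abstract.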

9. Granted invention patent

US06223158B1 Statistical option generator for alpha-numeric pre-database speech recognition correction (Lapsed)
Publication No.: US06223158B1
Publication date: 2001-04-24
Application No.: US09018449
Filing date: 1998-02-04
Applicant(s): Randy G. Goldberg
Inventor(s): Randy G. Goldberg
IPC classification: G10L15/14
CPC classification: G10L15/08, G10L2015/085
Abstract: A method and apparatus for recognizing an input identifier entered by a user. A caller enters a predetermined identifier through a voice input device. A signal representing the entered identifier is transmitted to a remote recognizer, which responds to the identifier signal by producing a recognized output intended to match the entered identifier. The present invention then generates a set of option identifiers, each option identifier having a possibility of matching the input identifier. The set of option identifiers is then reduced to a set of candidate identifiers by eliminating those option identifiers that are not found among a set of stored reference identifiers. The present invention selects a match for the input identifier from the set of candidate identifiers.
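The option-generation and filtering steps in this abstract can be sketched in Python as below. The confusion sets, function name, and identifiers are invented for illustration; the abstract does not say how options are generated or how the final match is chosen from the candidates, so the sketch stops at the candidate set.

```python
from itertools import product

# Hypothetical confusion sets: characters a recognizer might mix up.
CONFUSABLE = {
    "B": "BDP", "D": "BDP", "P": "BDP",
    "M": "MN", "N": "MN",
    "5": "59", "9": "59",
}


def candidate_identifiers(recognized, references):
    """Expand the recognized string into options, keep those on file.

    Every character is replaced by each member of its confusion set
    (or left unchanged if it has none); options that are not among the
    stored reference identifiers are discarded.
    """
    alternatives = (CONFUSABLE.get(ch, ch) for ch in recognized)
    options = ("".join(chars) for chars in product(*alternatives))
    return [opt for opt in options if opt in references]


# The recognizer heard "B5N"; the reference set is made up.
references = {"D9M", "P5N", "X5N"}
candidates = candidate_identifiers("B5N", references)
```

The number of options grows multiplicatively with identifier length, so filtering against the stored references is what keeps the candidate set small.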

10. Granted invention patent

US06202047B1 Method and apparatus for speech recognition using second order statistics and linear estimation of cepstral coefficients (Lapsed)
Publication No.: US06202047B1
Publication date: 2001-03-13
Application No.: US09050301
Filing date: 1998-03-30
Applicant(s): Yariv Ephraim, Mazin G. Rahim
Inventor(s): Yariv Ephraim, Mazin G. Rahim
IPC classification: G10L15/14
CPC classification: G10L15/02, G10L15/142, G10L25/24
Abstract: A method and apparatus for speech recognition using second order statistics and linear estimation of cepstral coefficients. In one embodiment, a speech input signal is received and cepstral features are extracted. An answer is generated using the extracted cepstral features and a fixed signal independent diagonal matrix as the covariance matrix for the cepstral components of the speech input signal and, for example, a hidden Markov model. In another embodiment, a noisy speech input signal is received and a cepstral vector representing a clean speech input signal is generated based on the noisy speech input signal and an explicit linear minimum mean square error cepstral estimator.
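For the first embodiment, the idea of a fixed, signal-independent diagonal covariance can be illustrated with a Gaussian log-likelihood whose variances are shared by every state and never re-estimated. This is a minimal Python sketch; the variance values, state means, and function names are placeholders, since the abstract does not give the actual matrix. The linear minimum mean square error estimator of the second embodiment is not shown.

```python
import math

# Placeholder values: one fixed variance per cepstral dimension,
# shared by all states and independent of the input signal.
FIXED_VAR = [1.0, 1.0]


def log_likelihood(cepstrum, state_mean, var=FIXED_VAR):
    """Log-density of a cepstral vector under a diagonal Gaussian."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (c - m) ** 2 / v
        for c, m, v in zip(cepstrum, state_mean, var)
    )


def best_state(cepstrum, state_means):
    """Pick the state whose mean explains the cepstral vector best."""
    return max(
        state_means,
        key=lambda name: log_likelihood(cepstrum, state_means[name]),
    )


# Made-up state means for two states.
means = {"a": [0.0, 0.0], "b": [2.0, 2.0]}
winner = best_state([1.8, 2.1], means)
at_mean = log_likelihood([0.0, 0.0], means["a"])
```

With the covariance fixed, only the state means need to be trained and stored, and the log-determinant term is the same constant for every state.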
