Quick Patent Search: fast search of global patents, free commercial patent database (IPRDB)

1. Granted invention patent

US6006186A Method and apparatus for a parameter sharing speech recognition system (no longer in force)
Publication No.: US6006186A
Publication date: 1999-12-21
Application No.: US953026
Application date: 1997-10-16
Applicant: Ruxin Chen, Miyuki Tanaka, Duanpei Wu, Lex S. Olorenshaw
Inventor: Ruxin Chen, Miyuki Tanaka, Duanpei Wu, Lex S. Olorenshaw
IPC classification: G10L15/14, G10L15/18, G10L7/08
CPC classification: G10L15/142, G10L15/148
Abstract: A method and an apparatus for a parameter sharing speech recognition system are provided. Speech signals are received into a processor of a speech recognition system. The speech signals are processed using a speech recognition system hosting a shared hidden Markov model (HMM) produced by generating a number of phoneme models, some of which are shared. The phoneme models are generated by retaining as a separate phoneme model any triphone model having a number of trained frames available that exceeds a prespecified threshold. A shared phoneme model is generated to represent each of the groups of triphone phoneme models for which the number of trained frames having a common biphone exceed the prespecified threshold. A shared phoneme model is generated to represent each of the groups of triphone phoneme models for which the number of trained frames having an equivalent effect on a phonemic context exceed the prespecified threshold. A shared phoneme model is generated to represent each of the groups of triphone phoneme models having the same center context. The generated phoneme models are trained, and shared phoneme model states are generated that are shared among the phoneme models. Shared probability distribution functions are generated that are shared among the phoneme model states. Shared probability sub-distribution functions are generated that are shared among the phoneme model probability distribution functions. The shared phoneme model hierarchy is reevaluated for further sharing in response to the shared probability sub-distribution functions. Signals representative of the received speech signals are generated.
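
The back-off order described in this abstract (separate triphone, then shared biphone group, then shared center-phone group) can be sketched roughly as follows. This is an illustrative reading, not the patented procedure: the function name `assign_models`, the model-name strings, and the choice to pool only the triphones that were not retained are all assumptions, and the "equivalent effect on a phonemic context" grouping is left out.

```python
from collections import defaultdict


def assign_models(frame_counts, threshold):
    """Map each triphone (left, center, right) to the name of the model that represents it.

    frame_counts: dict mapping (left, center, right) to the number of training frames.
    threshold:    minimum number of frames needed to justify a model (hypothetical value).
    """
    # Triphones with too little data are the only ones pooled (an assumption).
    sparse = {tri: n for tri, n in frame_counts.items() if n <= threshold}

    left_pool = defaultdict(int)   # frames pooled over a shared (left, center) biphone
    right_pool = defaultdict(int)  # frames pooled over a shared (center, right) biphone
    for (left, center, right), n in sparse.items():
        left_pool[(left, center)] += n
        right_pool[(center, right)] += n

    assignment = {}
    for (left, center, right), n in frame_counts.items():
        if n > threshold:
            name = f"{left}-{center}+{right}"   # enough data: keep a separate triphone model
        elif left_pool[(left, center)] > threshold:
            name = f"{left}-{center}+*"         # share across the common left biphone
        elif right_pool[(center, right)] > threshold:
            name = f"*-{center}+{right}"        # share across the common right biphone
        else:
            name = f"*-{center}+*"              # fall back to the center-phone model
        assignment[(left, center, right)] = name
    return assignment
```

With a threshold of 50 frames, a triphone seen 120 times keeps its own model, while two sparse triphones that share a left biphone and together exceed 50 frames end up on one shared model.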

2. Granted invention patent

US06216103B1 Method for implementing a speech recognition system to determine speech endpoints during conditions with background noise (no longer in force)
Publication No.: US06216103B1
Publication date: 2001-04-10
Application No.: US08957875
Application date: 1997-10-20
Applicant: Duanpei Wu, Miyuki Tanaka, Ruxin Chen, Lex Olorenshaw
Inventor: Duanpei Wu, Miyuki Tanaka, Ruxin Chen, Lex Olorenshaw
IPC classification: G01L300
CPC classification: G10L25/87, G10L15/20
Abstract: A method for implementing a speech recognition system for use during conditions with background noise includes the steps of calculating, in real-time, sequential short-term delta energy parameters for speech energy from a spoken utterance, determining threshold values in the speech energy, and identifying a beginning point and an ending point for the spoken utterance based on the relationship between the threshold values and the short-term delta energy parameters.
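
A minimal sketch of endpoint detection from short-term delta energy is shown below. The abstract does not define the delta energy or the thresholds, so the definitions here are assumptions: delta energy is taken as the sum of absolute frame-to-frame energy changes over a short window, and a single threshold is derived from the leading frames, which are assumed to contain only background noise. The names `delta_energy`, `find_endpoints`, `span`, `noise_frames`, and `factor` are hypothetical.

```python
def delta_energy(energies, span=2):
    """Sum of absolute frame-to-frame energy changes over the last `span` transitions."""
    deltas = []
    for t in range(len(energies)):
        first = max(1, t - span + 1)
        changes = [abs(energies[i] - energies[i - 1]) for i in range(first, t + 1)]
        deltas.append(sum(changes))
    return deltas


def find_endpoints(energies, noise_frames=5, factor=4.0, span=2):
    """Return (begin, end) frame indices of the utterance, or None if nothing is detected."""
    if len(energies) < noise_frames:
        return None
    deltas = delta_energy(energies, span)
    # Threshold scaled from the delta energy of the leading, noise-only frames (assumed).
    noise_level = sum(deltas[:noise_frames]) / noise_frames
    threshold = factor * noise_level
    active = [t for t, d in enumerate(deltas) if d > threshold]
    if not active:
        return None
    return active[0], active[-1]
```

Because the delta energy is accumulated over a window, the detected ending point trails the last large energy change by up to `span` frames.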

3. Granted invention patent

US06718302B1 Method for utilizing validity constraints in a speech endpoint detector (no longer in force)
Publication No.: US06718302B1
Publication date: 2004-04-06
Application No.: US09482396
Application date: 2000-01-12
Applicant: Duanpei Wu, Miyuki Tanaka, Ruxin Chen, Lex Olorenshaw
Inventor: Duanpei Wu, Miyuki Tanaka, Ruxin Chen, Lex Olorenshaw
IPC classification: G10L1102
CPC classification: G10L25/87
Abstract: A method for utilizing validity constraints in a speech endpoint detector comprises a validity manager that may utilize a pulse width module to validate utterances that include a plurality of energy pulses during a certain time period. The validity manager also may utilize a minimum power module to ensure that speech energy below a pre-determined level is not classified as a valid utterance. In addition the validity manager may use a duration module to ensure that valid utterances fall within a specified duration. Finally, the validity manager may utilize a short-utterance minimum power module to specifically distinguish an utterance of short duration from background noise based on the energy level of the short utterance.
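
The four checks named in this abstract (pulse width, minimum power, duration, short-utterance minimum power) could be combined along the following lines. Every limit value, the `ValidityLimits` container, and the way pulse widths are measured (runs of frames above a floor) are assumptions made for illustration; the abstract gives no numbers or formulas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidityLimits:
    """Hypothetical limits; the abstract does not give concrete values."""
    min_power: float = 0.5        # mean power below this is never a valid utterance
    min_duration: int = 10        # shortest valid utterance, in frames
    max_duration: int = 300       # longest valid utterance, in frames
    short_duration: int = 20      # utterances shorter than this count as "short"
    short_min_power: float = 2.0  # stricter power floor for short utterances
    pulse_floor: float = 0.5      # energy level that separates pulses from gaps
    min_pulse_width: int = 5      # minimum total width of all pulses, in frames


def pulse_widths(energies, floor):
    """Lengths (in frames) of consecutive runs whose energy exceeds `floor`."""
    widths, run = [], 0
    for e in energies:
        if e > floor:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


def is_valid_utterance(energies, limits=ValidityLimits()):
    """Apply the duration, power, short-utterance and pulse-width checks in turn."""
    duration = len(energies)
    if not limits.min_duration <= duration <= limits.max_duration:
        return False
    mean_power = sum(energies) / duration
    if mean_power < limits.min_power:
        return False
    if duration < limits.short_duration and mean_power < limits.short_min_power:
        return False
    if sum(pulse_widths(energies, limits.pulse_floor)) < limits.min_pulse_width:
        return False
    return True
```

A 15-frame candidate with mean power 1.0 is rejected by the short-utterance check, while a 40-frame candidate at the same power passes.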

4. Granted invention patent

US06173258B2 Method for reducing noise distortions in a speech recognition system (no longer in force)
Publication No.: US06173258B2
Publication date: 2001-01-09
Application No.: US09177461
Application date: 1998-10-22
Applicant: Xavier Menendez-Pidal, Miyuki Tanaka, Ruxin Chen, Duanpei Wu
Inventor: Xavier Menendez-Pidal, Miyuki Tanaka, Ruxin Chen, Duanpei Wu
IPC classification: G10L506
CPC classification: G10L21/0208, G10L15/02, G10L21/0264
Abstract: A method for reducing noise distortions in a speech recognition system comprises a feature extractor that includes a noise-suppressor, one or more time cosine transforms, and a normalizer. The noise-suppressor preferably performs a spectral subtraction process early in the feature extraction procedure. The time cosine transforms preferably operate in a centered-mode to each perform a transformation in the time domain. The normalizer calculates and utilizes normalization values to generate normalized features for speech recognition. The calculated normalization values preferably include mean values, left variances and right variances.
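
The abstract lists mean values, left variances and right variances as normalization values but does not say how they are applied. One plausible reading, sketched below as an assumption, is an asymmetric normalization: values below the mean are scaled by the standard deviation of the left side, values at or above the mean by that of the right side. The function names are hypothetical.

```python
import math


def left_right_stats(values):
    """Return (mean, left variance, right variance) of one feature over time.

    The left variance uses only samples below the mean, the right variance only
    samples at or above it (an assumed definition).
    """
    mean = sum(values) / len(values)
    left = [(v - mean) ** 2 for v in values if v < mean]
    right = [(v - mean) ** 2 for v in values if v >= mean]
    left_var = sum(left) / len(left) if left else 0.0
    right_var = sum(right) / len(right) if right else 0.0
    return mean, left_var, right_var


def normalize(values):
    """Center each value and scale it by the deviation of its own side of the mean."""
    mean, left_var, right_var = left_right_stats(values)
    normalized = []
    for v in values:
        variance = left_var if v < mean else right_var
        scale = math.sqrt(variance) or 1.0  # avoid dividing by zero on a flat side
        normalized.append((v - mean) / scale)
    return normalized
```

For the skewed sequence `[0, 0, 0, 4]` the mean is 1, the left variance is 1 and the right variance is 9, so both sides map to unit magnitude.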

5. Granted invention patent

US06778959B1 System and method for speech verification using out-of-vocabulary models (no longer in force)
Publication No.: US06778959B1
Publication date: 2004-08-17
Application No.: US09691877
Application date: 2000-10-18
Applicant: Duanpei Wu, Lex Olorenshaw, Xavier Menendez-Pidal, Ruxin Chen
Inventor: Duanpei Wu, Lex Olorenshaw, Xavier Menendez-Pidal, Ruxin Chen
IPC classification: G10L1314
CPC classification: G10L15/08
Abstract: A system and method for speech verification using out-of-vocabulary models includes a speech recognizer that has a model bank with system vocabulary word models, a garbage model, and one or more noise models. The model bank may reject an utterance or other sound as an invalid vocabulary word when the model bank identifies the utterance or other sound as corresponding to the garbage model or the noise models. Initial noise models may be selectively combined into a pre-determined number of final noise model clusters to effectively reduce the number of noise models that are utilized by the model bank of the speech recognizer to verify system vocabulary words.
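
The rejection rule and the noise-model reduction described in this abstract can be illustrated with the sketch below. It assumes each model produces one score per utterance (higher is better), that garbage and noise models are identified by name, and that each noise model can be summarized by a mean vector so that the closest pair can be merged greedily. None of these representations come from the source; `verify`, `merge_noise_models` and the name conventions are hypothetical.

```python
def verify(scores, garbage="<garbage>", noise_prefix="<noise"):
    """Return the best-scoring vocabulary word, or None if garbage or noise wins."""
    best = max(scores, key=scores.get)
    if best == garbage or best.startswith(noise_prefix):
        return None
    return best


def merge_noise_models(models, target):
    """Greedily merge the two closest noise models until `target` clusters remain.

    models: dict mapping a noise-model name to its mean vector (assumed representation).
    Returns a list of (member names, cluster mean vector) pairs.
    """
    target = max(1, target)
    clusters = [([name], list(vec)) for name, vec in models.items()]
    while len(clusters) > target:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dist = sum((a - b) ** 2 for a, b in zip(clusters[i][1], clusters[j][1]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        names_i, vec_i = clusters[i]
        names_j, vec_j = clusters[j]
        n_i, n_j = len(names_i), len(names_j)
        # The merged mean is weighted by how many original models each cluster holds.
        merged = [(a * n_i + b * n_j) / (n_i + n_j) for a, b in zip(vec_i, vec_j)]
        clusters[i] = (names_i + names_j, merged)
        del clusters[j]
    return clusters
```

Fewer noise clusters mean fewer competing models to score per utterance, which is the reduction the abstract refers to.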

6. Granted invention patent

US06473735B1 System and method for speech verification using a confidence measure (no longer in force)
Publication No.: US06473735B1
Publication date: 2002-10-29
Application No.: US09553985
Application date: 2000-04-20
Applicant: Duanpei Wu, Xavier Menendez-Pidal, Lex Olorenshaw, Ruxin Chen
Inventor: Duanpei Wu, Xavier Menendez-Pidal, Lex Olorenshaw, Ruxin Chen
IPC classification: G10L1506
CPC classification: G10L15/10, G10L2015/085
Abstract: The present invention comprises a system and method for speech verification using a confidence measure that includes a speech verifier which compares a differential score for a recognized word to a predetermined threshold value, where a recognized word is the word model that produced the highest recognition score. In one embodiment, a single threshold is used for each word in a vocabulary. In another embodiment, each word model has an associated threshold, so that a differential score for a recognized word is compared to a unique threshold associated with that word. In a further embodiment, pairs of confused words in the vocabulary are dealt with separately. If a confused word is the recognized word, the speech verifier compares the differential score to a threshold that depends on the word model that produced the next-highest recognition score. Different values for the various thresholds may maximize rejection accuracy or recognition accuracy. A trade-off between rejection accuracy and recognition accuracy may be made by utilizing an intermediate threshold value that is between a minimum threshold value and a maximum threshold value.
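
The three threshold schemes in this abstract (one global threshold, one threshold per word, and a threshold per confusable pair) can be sketched as below. The abstract does not define the differential score, so it is assumed here to be the gap between the highest and next-highest recognition scores; `verify_word` and its parameters are hypothetical names.

```python
def verify_word(scores, default_threshold, word_thresholds=None, pair_thresholds=None):
    """Accept or reject the recognized word based on its differential score.

    scores:          dict mapping each word model to its recognition score (higher is better).
    word_thresholds: optional per-word thresholds, keyed by the recognized word.
    pair_thresholds: optional thresholds for confusable pairs, keyed by
                     (recognized word, runner-up word).
    Returns (accepted word or None, differential score).
    """
    if len(scores) < 2:
        raise ValueError("need at least two word models to form a differential score")
    ranked = sorted(scores, key=scores.get, reverse=True)
    best, runner_up = ranked[0], ranked[1]
    differential = scores[best] - scores[runner_up]  # assumed definition

    threshold = default_threshold
    if word_thresholds and best in word_thresholds:
        threshold = word_thresholds[best]
    if pair_thresholds and (best, runner_up) in pair_thresholds:
        # For confusable words the threshold depends on the runner-up as well.
        threshold = pair_thresholds[(best, runner_up)]

    accepted = best if differential >= threshold else None
    return accepted, differential
```

Raising a threshold rejects more out-of-vocabulary input at the cost of rejecting more correct recognitions, which is the trade-off the abstract mentions.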

7. Granted invention patent

US06826528B1 Weighted frequency-channel background noise suppressor (no longer in force)
Publication No.: US06826528B1
Publication date: 2004-11-30
Application No.: US09691878
Application date: 2000-10-18
Applicant: Duanpei Wu, Miyuki Tanaka, Xavier Menendez-Pidal
Inventor: Duanpei Wu, Miyuki Tanaka, Xavier Menendez-Pidal
IPC classification: G10L2102
CPC classification: G10L21/0208, G10L21/0232, G10L25/18, G10L25/78
Abstract: A method for implementing a noise suppressor in a speech recognition system comprises a filter bank for separating source speech data into discrete frequency sub-bands to generate filtered channel energy, and a noise suppressor for weighting the frequency sub-bands to improve the signal-to-noise ratio of the resultant noise-suppressed channel energy. The noise suppressor preferably includes a noise calculator for calculating background noise values, a speech energy calculator for calculating speech energy values for each channel of the filter bank, and a weighting module for applying calculated weighting values to the projected channel energy to generate the noise-suppressed channel energy.
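
A rough sketch of per-channel weighting follows. The abstract names a noise calculator, a speech energy calculator and a weighting module but gives no formulas, so the ones used here are assumptions: noise is the per-channel mean over noise-only frames, speech energy is the excess over that noise, and the weight is the Wiener-style ratio speech / (speech + noise). All function names are hypothetical.

```python
def channel_means(frames):
    """Per-channel mean energy over a list of frames (each frame lists channel energies)."""
    count = len(frames)
    return [sum(channel) / count for channel in zip(*frames)]


def channel_weights(speech_energy, noise_energy):
    """Weight each channel by speech / (speech + noise); silent channels get weight 0."""
    weights = []
    for speech, noise in zip(speech_energy, noise_energy):
        total = speech + noise
        weights.append(speech / total if total > 0 else 0.0)
    return weights


def suppress(frames, noise_frames):
    """Scale every channel of every frame by its weight to favor high-SNR channels."""
    noise = channel_means(noise_frames)
    # Speech energy is taken as the energy in excess of the background noise (assumed).
    speech = [max(0.0, total - n) for total, n in zip(channel_means(frames), noise)]
    weights = channel_weights(speech, noise)
    return [[w * e for w, e in zip(weights, frame)] for frame in frames]
```

Channels dominated by noise are attenuated toward zero, while channels where speech dominates pass almost unchanged.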

8. Granted invention patent

US06230122B1 Speech detection with noise suppression based on principal components analysis (no longer in force)
Publication No.: US06230122B1
Publication date: 2001-05-08
Application No.: US09176178
Application date: 1998-10-21
Applicant: Duanpei Wu, Miyuki Tanaka, Mariscela Amador-Hernandez
Inventor: Duanpei Wu, Miyuki Tanaka, Mariscela Amador-Hernandez
IPC classification: G10L2102
CPC classification: G10L21/0208, G10L21/0232
Abstract: A method for effectively suppressing background noise in a speech detection system comprises a filter bank for separating source speech data into discrete frequency sub-bands to generate filtered channel energy, and a noise suppressor for weighting the frequency sub-bands to improve the signal-to-noise ratio of the resultant noise-suppressed channel energy. The noise suppressor preferably includes a subspace module for using a Karhunen-Loeve transformation to create a subspace based on the background noise, a projection module for generating projected channel energy by projecting the filtered channel energy onto the created subspace, and a weighting module for applying calculated weighting values to the projected channel energy to generate the noise-suppressed channel energy.
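
The abstract does not say which subspace the Karhunen-Loeve transformation selects, so the sketch below makes an explicit assumption: it finds the dominant direction of the background-noise covariance by power iteration and projects each centered frame onto the complement of that direction, removing the component where noise varies most. The function names and the use of power iteration (rather than a full eigendecomposition) are illustrative choices, not the patented method.

```python
import math


def covariance(frames):
    """Return (mean vector, covariance matrix) of a list of channel-energy frames."""
    count, dim = len(frames), len(frames[0])
    mean = [sum(f[k] for f in frames) / count for k in range(dim)]
    cov = [
        [sum((f[a] - mean[a]) * (f[b] - mean[b]) for f in frames) / count for b in range(dim)]
        for a in range(dim)
    ]
    return mean, cov


def principal_axis(cov, iterations=200):
    """Dominant eigenvector of `cov` by power iteration.

    Starts from a uniform vector, so it assumes the dominant direction is not
    orthogonal to that starting point.
    """
    dim = len(cov)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def remove_noise_component(frame, mean, axis):
    """Center the frame on the noise mean and remove its component along `axis`."""
    centered = [x - m for x, m in zip(frame, mean)]
    coeff = sum(c * a for c, a in zip(centered, axis))
    return [c - coeff * a for c, a in zip(centered, axis)]
```

The output is a centered residual and can be negative, so a real system would still need the weighting step the abstract describes before treating it as channel energy.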

9. Granted invention patent

US06751588B1 Method for performing microphone conversions in a speech recognition system (no longer in force)
Publication No.: US06751588B1
Publication date: 2004-06-15
Application No.: US09449424
Application date: 1999-11-23
Applicant: Xavier Menendez-Pidal, Miyuki Tanaka, Duanpei Wu
Inventor: Xavier Menendez-Pidal, Miyuki Tanaka, Duanpei Wu
IPC classification: G10L1506
CPC classification: G10L15/065
Abstract: A method for performing microphone conversions in a speech recognition system comprises a speech module that simultaneously captures an identical input signal using both an original microphone and a final microphone. The original microphone is also used to record an original training database. The final microphone is also used to capture input signals during normal use of the speech recognition system. A characterization module then analyzes the recorded identical input signal to generate characterization values that are subsequently utilized by a conversion module to convert the original training database into a final training database. A training program then uses the final training database to train a recognizer in the speech module in order to optimally perform a speech recognition process, in accordance with the present invention.
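
The characterize-then-convert flow in this abstract might look like the sketch below. The form of the characterization values is not given in the abstract; here they are assumed to be per-channel mean differences between log-spectral frames recorded simultaneously by the two microphones, applied as additive offsets to the original training database. `characterize` and `convert` are hypothetical names.

```python
def characterize(original_frames, final_frames):
    """Per-channel mean offset between simultaneous recordings of the same signal.

    Both arguments are lists of log-spectral frames of equal length and dimension
    (an assumed feature representation).
    """
    if len(original_frames) != len(final_frames) or not original_frames:
        raise ValueError("recordings must be non-empty and frame-aligned")
    count = len(original_frames)
    dim = len(original_frames[0])
    return [
        sum(fin[k] - orig[k] for orig, fin in zip(original_frames, final_frames)) / count
        for k in range(dim)
    ]


def convert(database, offsets):
    """Shift every frame of the original training database by the channel offsets."""
    return [[x + d for x, d in zip(frame, offsets)] for frame in database]
```

The converted database can then be used to retrain the recognizer so that its models match the microphone used in normal operation.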

10. Granted invention patent

US06272460B1 Method for implementing a speech verification system for use in a noisy environment (no longer in force)
Publication No.: US06272460B1
Publication date: 2001-08-07
Application No.: US09264288
Application date: 1999-03-08
Applicant: Duanpei Wu, Miyuki Tanaka, Lex Olorenshaw
Inventor: Duanpei Wu, Miyuki Tanaka, Lex Olorenshaw
IPC classification: G10L1900
CPC classification: G10L25/78
Abstract: A method for implementing a speech verification system for use in a noisy environment comprises the steps of generating a confidence index for an utterance using a speech verifier, and controlling the speech verifier with a processor, wherein the utterance contains frames of sound energy. The speech verifier includes a noise suppressor, a pitch detector, and a confidence determiner. The noise suppressor suppresses noise in each frame in the utterance by summing a frequency spectrum for each frame with frequency spectra of a selected number of previous frames to produce a spectral sum. The pitch detector applies a spectral comb window to each spectral sum to produce correlation values for each frame in the utterance. The pitch detector also applies an alternate spectral comb window to each spectral sum to produce alternate correlation values for each frame in the utterance. The confidence determiner evaluates the correlation values to produce a frame confidence measure for each frame in the utterance. The confidence determiner then uses the frame confidence measures to generate the confidence index for the utterance, which indicates whether the utterance is or is not speech.
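
The pipeline in this abstract (spectral sum, comb-window correlation, per-frame confidence, utterance-level index) is sketched below under several assumptions: the comb is a set of equally spaced bins, the correlation is the fraction of spectral energy falling on the comb teeth, a frame counts as voiced when its best correlation reaches a fixed threshold, and the confidence index is the fraction of voiced frames. The alternate comb window is omitted, and all names and parameter values are hypothetical.

```python
def spectral_sums(spectra, history=2):
    """Sum each frame's spectrum with those of up to `history` previous frames."""
    sums = []
    for t in range(len(spectra)):
        window = spectra[max(0, t - history): t + 1]
        sums.append([sum(bins) for bins in zip(*window)])
    return sums


def comb_correlation(spectrum, spacing):
    """Fraction of spectral energy on a comb whose teeth sit at multiples of `spacing`."""
    total = sum(spectrum)
    if total <= 0:
        return 0.0
    on_teeth = sum(spectrum[k] for k in range(spacing, len(spectrum), spacing))
    return on_teeth / total


def confidence_index(spectra, spacings, history=2, frame_threshold=0.5):
    """Fraction of frames whose best comb correlation reaches `frame_threshold`."""
    frame_confidences = [
        max(comb_correlation(summed, spacing) for spacing in spacings)
        for summed in spectral_sums(spectra, history)
    ]
    if not frame_confidences:
        return 0.0
    voiced = sum(1 for c in frame_confidences if c >= frame_threshold)
    return voiced / len(frame_confidences)
```

Very small spacings should be excluded from `spacings`, since a comb with a tooth on nearly every bin correlates well with any spectrum, voiced or not.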
