Quick Patent Search - fast search of global patents, free patent database for commercial use - IPRDB

11. Granted invention patent

US07856351B2 Integrated speech recognition and semantic classification (In force)
Publication No.: US07856351B2
Publication date: 2010-12-21
Application No.: US11655703
Filing date: 2007-01-19
Applicants: Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alejandro Acero
Inventors: Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alejandro Acero
IPC classification: G06F17/27
CPC classification: G10L15/1815
Abstract: A novel system integrates speech recognition and semantic classification, so that acoustic scores in a speech recognizer that accepts spoken utterances may be taken into account when training both language models and semantic classification models. For example, a joint association score may be defined that is indicative of a correspondence of a semantic class and a word sequence for an acoustic signal. The joint association score may incorporate parameters such as weighting parameters for signal-to-class modeling of the acoustic signal, language model parameters and scores, and acoustic model parameters and scores. The parameters may be revised to raise the joint association score of a target word sequence with a target semantic class relative to the joint association score of a competitor word sequence with the target semantic class. The parameters may be designed so that the semantic classification errors in the training data are minimized.
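
The abstract above does not give a formula for the joint association score, so the following is only a minimal sketch under stated assumptions: the score is taken to be a sum of a weighted classification term, a scaled language model log-probability, and an acoustic log-probability, and the "raise target relative to competitor" idea is expressed as a sigmoid loss on the score difference. The function names, the `lm_scale` and `slope` parameters, and all numbers are hypothetical and are not taken from the patent.

```python
import math

def joint_score(class_weights, word_counts, lm_logprob, acoustic_logprob, lm_scale=1.0):
    """Hypothetical joint association score of one class with one word sequence."""
    # Weighted classification term: per-word weights for the class times word counts.
    class_term = sum(class_weights.get(w, 0.0) * n for w, n in word_counts.items())
    return class_term + lm_scale * lm_logprob + acoustic_logprob

def sigmoid_loss(target_score, competitor_score, slope=1.0):
    """Smooth error measure: near 0 when the target wins, near 1 when it loses."""
    d = competitor_score - target_score
    return 1.0 / (1.0 + math.exp(-slope * d))

# Illustrative numbers only: (word counts, LM log-prob, acoustic log-prob).
weights = {"book": 0.2, "flight": 0.5}
target = ({"book": 1, "a": 1, "flight": 1}, -4.0, -10.0)
competitor = ({"look": 1, "at": 1, "light": 1}, -5.0, -9.5)

loss_before = sigmoid_loss(joint_score(weights, *target), joint_score(weights, *competitor))

# Revising a parameter that favors the target word sequence lowers the loss.
weights["flight"] = 1.5
loss_after = sigmoid_loss(joint_score(weights, *target), joint_score(weights, *competitor))
```

In this toy setup, increasing the weight of a word that occurs only in the target word sequence widens the score gap over the competitor, which is the direction of revision the abstract describes.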

12. Invention patent application

US20100256977A1 MAXIMUM ENTROPY MODEL WITH CONTINUOUS FEATURES (Under examination, published)
Publication No.: US20100256977A1
Publication date: 2010-10-07
Application No.: US12416161
Filing date: 2009-04-01
Applicants: Dong Yu, Li Deng, Alejandro Acero
Inventors: Dong Yu, Li Deng, Alejandro Acero
IPC classification: G10L15/00
CPC classification: G10L15/14, G06K9/6277, G06K9/6297
Abstract: Described is a technology by which a maximum entropy (MaxEnt) model, such as one used as a classifier or embedded in a conditional random field or hidden conditional random field, uses continuous features with continuous weights that are continuous functions of the feature values (instead of single-valued weights). The continuous weights may be approximated by a spline-based solution. In general, this converts the optimization problem into a standard log-linear optimization problem without continuous weights in a higher-dimensional space.
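
To illustrate how a continuous weight can be turned into ordinary constant weights in a higher-dimensional space, here is a minimal sketch. It assumes a piecewise-linear interpolant over fixed knots as a stand-in for the spline mentioned in the abstract (the abstract does not specify the spline order), and the names `hat_coefficients`, `expand_feature`, `knots`, and `knot_weights` are hypothetical. If the weight is written as λ(f) = Σ_k a_k(f) λ_k, then λ(f)·f = Σ_k λ_k·[a_k(f)·f], so one continuous feature becomes K features with constant weights λ_k.

```python
from bisect import bisect_right

def hat_coefficients(f, knots):
    """Linear-interpolation coefficients a_k(f) over sorted knots (they sum to 1)."""
    coeffs = [0.0] * len(knots)
    if f <= knots[0]:          # clamp below the first knot
        coeffs[0] = 1.0
        return coeffs
    if f >= knots[-1]:         # clamp above the last knot
        coeffs[-1] = 1.0
        return coeffs
    j = bisect_right(knots, f) - 1            # knots[j] <= f < knots[j + 1]
    t = (f - knots[j]) / (knots[j + 1] - knots[j])
    coeffs[j] = 1.0 - t
    coeffs[j + 1] = t
    return coeffs

def expand_feature(f, knots):
    """Map one continuous feature value to K features a_k(f) * f."""
    return [a * f for a in hat_coefficients(f, knots)]

# Illustrative values only.
knots = [0.0, 1.0, 2.0]
knot_weights = [0.5, 1.5, -1.0]   # constant weights, one per knot
f = 1.25

# Continuous-weight form: interpolate the weight at f, then multiply by f.
weight_at_f = sum(a * w for a, w in zip(hat_coefficients(f, knots), knot_weights))
continuous_form = weight_at_f * f

# Log-linear form: constant weights applied to the expanded features.
log_linear_form = sum(w * x for w, x in zip(knot_weights, expand_feature(f, knots)))
```

Both forms give the same contribution to the model score, which is why a standard log-linear optimizer can be used on the expanded features.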

13. Invention patent application

US20100076758A1 PHASE SENSITIVE MODEL ADAPTATION FOR NOISY SPEECH RECOGNITION (In force)
Publication No.: US20100076758A1
Publication date: 2010-03-25
Application No.: US12236530
Filing date: 2008-09-24
Applicants: Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alejandro Acero
Inventors: Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alejandro Acero
IPC classification: G10L15/20, G10L15/14
CPC classification: G10L15/065, G10L15/20
Abstract: A speech recognition system described herein includes a receiver component that receives a distorted speech utterance. The speech recognition system also includes an updater component that is in communication with a first model and a second model, wherein the updater component automatically updates parameters of the second model based at least in part upon joint estimates of additive and convolutive distortions output by the first model, wherein the joint estimates of additive and convolutive distortions are estimates of distortions based on a phase-sensitive model in the speech utterance received by the receiver component. Further, distortions other than additive and convolutive distortions, including other stationary and nonstationary sources, can also be estimated and used to update the parameters of the second model.

14. Invention patent application

US20100070280A1 PARAMETER CLUSTERING AND SHARING FOR VARIABLE-PARAMETER HIDDEN MARKOV MODELS (In force)
Publication No.: US20100070280A1
Publication date: 2010-03-18
Application No.: US12211115
Filing date: 2008-09-16
Applicants: Dong Yu, Li Deng, Yifan Gong, Alejandro Acero
Inventors: Dong Yu, Li Deng, Yifan Gong, Alejandro Acero
IPC classification: G10L15/14
CPC classification: G10L15/142
Abstract: A speech recognition system uses Gaussian mixture variable-parameter hidden Markov models (VPHMMs) to recognize speech. The VPHMMs include Gaussian parameters that vary as a function of at least one environmental conditioning parameter. The relationship of each Gaussian parameter to the environmental conditioning parameter(s) is modeled using a piecewise fitting approach, such as by using spline functions. In a training phase, the recognition system can use clustering to identify classes of spline functions, each class grouping together spline functions which are similar to each other based on some distance measure. The recognition system can then store sets of spline parameters that represent respective classes of spline functions. An instance of a spline function that belongs to a class can make reference to an associated shared set of spline parameters. The Gaussian parameters can be represented in an efficient form that accommodates the use of sharing in the above-summarized manner.
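
The clustering-and-sharing idea in the abstract above can be sketched as follows. This is not the patent's procedure: it assumes each spline is summarized by its values at a fixed set of knots, uses Euclidean distance as the distance measure, and swaps in simple leader clustering (the first member of a class serves as its representative) because the abstract does not name a clustering algorithm. The names `cluster_splines`, `threshold`, and the sample data are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two splines given as values at shared knots."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_splines(splines, threshold):
    """Leader clustering: return (shared parameter sets, class index per spline)."""
    shared, index = [], []
    for s in splines:
        for k, representative in enumerate(shared):
            if distance(s, representative) <= threshold:
                index.append(k)          # reuse an existing shared set
                break
        else:
            shared.append(list(s))       # start a new class with s as representative
            index.append(len(shared) - 1)
    return shared, index

# Illustrative data: four Gaussian-mean splines sampled at three knots.
splines = [
    [0.0, 1.0, 2.0],
    [0.05, 1.0, 2.05],
    [5.0, 5.0, 5.0],
    [0.0, 1.02, 2.0],
]
shared_sets, class_of = cluster_splines(splines, threshold=0.2)
```

After clustering, each Gaussian only needs to store an index into `shared_sets` rather than its own spline parameters, which is the storage saving that sharing is meant to provide.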

15. Granted invention patent

US07409346B2 Two-stage implementation for phonetic recognition using a bi-directional target-filtering model of speech coarticulation and reduction (In force)
Publication No.: US07409346B2
Publication date: 2008-08-05
Application No.: US11069474
Filing date: 2005-03-01
Applicants: Alejandro Acero, Dong Yu, Li Deng
Inventors: Alejandro Acero, Dong Yu, Li Deng
IPC classification: G10L15/10
CPC classification: G10L15/02, G10L25/15, G10L25/24, G10L2015/025
Abstract: A structured generative model of speech coarticulation and reduction is described with a novel two-stage implementation. At the first stage, the dynamics of formants or vocal tract resonance (VTR) are generated using prior information of resonance targets in the phone sequence. Bi-directional temporal filtering with finite impulse response (FIR) is applied to the segmental target sequence as the FIR filter's input. At the second stage, the dynamics of speech cepstra are predicted analytically based on the FIR filtered VTR targets. The combined system of these two stages thus generates correlated and causally related VTR and cepstral dynamics where phonetic reduction is represented explicitly in the hidden resonance space and implicitly in the observed cepstral space. The combined system also gives the acoustic observation probability given a phone sequence. Using this probability, different phone sequences can be compared and ranked in terms of their respective probability values. This then permits the use of the model for phonetic recognition.
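
The first stage described above (bi-directional FIR filtering of a segmental target sequence) can be sketched as below. The abstract does not give the filter's impulse response, so this sketch assumes a symmetric, exponentially decaying kernel normalized to unit gain; `gamma`, `half_width`, the edge handling (renormalizing over the frames that exist), and the target values are all hypothetical choices for illustration.

```python
def bidirectional_fir(targets, gamma=0.6, half_width=3):
    """Smooth a stepwise target sequence with a symmetric (non-causal) FIR kernel."""
    n = len(targets)
    smoothed = []
    for t in range(n):
        num = 0.0
        den = 0.0
        # Look both backward (k < 0) and forward (k > 0) in time.
        for k in range(-half_width, half_width + 1):
            if 0 <= t + k < n:
                w = gamma ** abs(k)      # weight decays with distance from frame t
                num += w * targets[t + k]
                den += w
        smoothed.append(num / den)       # normalize so a constant input is unchanged
    return smoothed

# Illustrative resonance targets in Hz: one phone at 500, the next at 1500.
targets = [500.0] * 5 + [1500.0] * 5
trajectory = bidirectional_fir(targets)
```

Because the kernel looks ahead as well as behind, the trajectory starts moving toward the next target before the segment boundary, which is how a filter of this kind can represent anticipatory as well as carry-over coarticulation.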

16. Invention patent application

US20080177547A1 Integrated speech recognition and semantic classification (In force)
Publication No.: US20080177547A1
Publication date: 2008-07-24
Application No.: US11655703
Filing date: 2007-01-19
Applicants: Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alejandro Acero
Inventors: Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alejandro Acero
IPC classification: G10L15/18
CPC classification: G10L15/1815
Abstract: A novel system integrates speech recognition and semantic classification, so that acoustic scores in a speech recognizer that accepts spoken utterances may be taken into account when training both language models and semantic classification models. For example, a joint association score may be defined that is indicative of a correspondence of a semantic class and a word sequence for an acoustic signal. The joint association score may incorporate parameters such as weighting parameters for signal-to-class modeling of the acoustic signal, language model parameters and scores, and acoustic model parameters and scores. The parameters may be revised to raise the joint association score of a target word sequence with a target semantic class relative to the joint association score of a competitor word sequence with the target semantic class. The parameters may be designed so that the semantic classification errors in the training data are minimized.

17. Granted invention patent

US08942978B2 Parameter learning in a hidden trajectory model (In force)
Publication No.: US08942978B2
Publication date: 2015-01-27
Application No.: US13182971
Filing date: 2011-07-14
Applicants: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
Inventors: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
IPC classification: G10L15/00, G10L15/06
CPC classification: G10L15/063, G10L2015/025
Abstract: Parameters for distributions of a hidden trajectory model including means and variances are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation includes only acoustic data and not any intermediate estimate on hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.
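
As a rough illustration of estimating means and variances by gradient ascent on a likelihood, the sketch below fits a single scalar Gaussian to observations. This is a toy stand-in, not the hidden trajectory model: the abstract does not give the model's likelihood or gradients, and the function name, the log-variance parameterization (used here so the variance stays positive), the learning rate, and the data are all assumptions.

```python
import math

def fit_gaussian_by_gradient_ascent(observations, learning_rate=0.1, steps=2000):
    """Maximize the average Gaussian log-likelihood over (mean, log-variance)."""
    mean, log_var = 0.0, 0.0
    n = len(observations)
    for _ in range(steps):
        var = math.exp(log_var)
        # d/d(mean) of the average log-likelihood.
        grad_mean = sum(o - mean for o in observations) / (n * var)
        # d/d(log_var) of the average log-likelihood.
        grad_log_var = sum((o - mean) ** 2 for o in observations) / (2.0 * n * var) - 0.5
        mean += learning_rate * grad_mean        # ascent step, not descent
        log_var += learning_rate * grad_log_var
    return mean, math.exp(log_var)

# Illustrative observations only.
observations = [1.0, 2.0, 3.0, 4.0]
estimated_mean, estimated_var = fit_gaussian_by_gradient_ascent(observations)
```

For this toy case the iteration settles on the sample mean and the maximum-likelihood (biased) sample variance, which gives a simple check that the gradients are correct before applying the same pattern to a richer likelihood.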

18. Granted invention patent

US08239195B2 Adapting a compressed model for use in speech recognition (In force)
Publication No.: US08239195B2
Publication date: 2012-08-07
Application No.: US12235748
Filing date: 2008-09-23
Applicants: Jinyu Li, Li Deng, Dong Yu, Jian Wu, Yifan Gong, Alejandro Acero
Inventors: Jinyu Li, Li Deng, Dong Yu, Jian Wu, Yifan Gong, Alejandro Acero
IPC classification: G10L15/20
CPC classification: G10L15/20, G10L15/065
Abstract: A speech recognition system includes a receiver component that receives a distorted speech utterance. The speech recognition system also includes an adaptor component that selectively adapts parameters of a compressed model used to recognize at least a portion of the distorted speech utterance, wherein the adaptor component selectively adapts the parameters of the compressed model based at least in part upon the received distorted speech utterance.

19. Invention patent application

US20110270610A1 PARAMETER LEARNING IN A HIDDEN TRAJECTORY MODEL (In force)
Publication No.: US20110270610A1
Publication date: 2011-11-03
Application No.: US13182971
Filing date: 2011-07-14
Applicants: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
Inventors: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
IPC classification: G10L15/00
CPC classification: G10L15/063, G10L2015/025
Abstract: Parameters for distributions of a hidden trajectory model including means and variances are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation includes only acoustic data and not any intermediate estimate on hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.

20. Granted invention patent

US08010356B2 Parameter learning in a hidden trajectory model (In force)
Publication No.: US08010356B2
Publication date: 2011-08-30
Application No.: US11356898
Filing date: 2006-02-17
Applicants: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
Inventors: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
IPC classification: G10L15/00
CPC classification: G10L15/063, G10L2015/025
Abstract: Parameters for distributions of a hidden trajectory model including means and variances are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation includes only acoustic data and not any intermediate estimate on hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.
