Quick Patent Search: quickly search global patents in a free, commercial-use patent database (IPRDB)

1. Granted invention patent

US08942978B2 Parameter learning in a hidden trajectory model (in force)
Publication number: US08942978B2
Publication date: 2015-01-27
Application number: US13182971
Filing date: 2011-07-14
Applicant: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
Inventor: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
IPC classification: G10L15/00, G10L15/06
CPC classification: G10L15/063, G10L2015/025
Abstract: Parameters for distributions of a hidden trajectory model including means and variances are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation includes only acoustic data and not any intermediate estimate on hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.
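
As a rough illustration of the gradient-ascent idea in the abstract above, the sketch below maximizes a plain one-dimensional Gaussian log-likelihood over a mean and a log-variance. It is not the hidden trajectory model of the patent; the function name `fit_gaussian_by_gradient_ascent`, the learning rate, the step count, and the toy data are all assumptions made for illustration.

```python
import math


def fit_gaussian_by_gradient_ascent(data, lr=0.1, steps=2000):
    """Estimate a Gaussian mean and variance by gradient ascent.

    The objective is the average log-likelihood of the observations.
    The variance is parameterized as exp(log_var) so it stays positive.
    (Hypothetical helper: a toy stand-in, not the patented estimator.)
    """
    mean, log_var = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        var = math.exp(log_var)
        # d/d(mean) of the average log-likelihood
        grad_mean = sum(x - mean for x in data) / (n * var)
        # d/d(log_var) of the average log-likelihood
        grad_log_var = sum(0.5 * ((x - mean) ** 2 / var - 1.0) for x in data) / n
        # Ascent step: move along the gradient to increase the likelihood
        mean += lr * grad_mean
        log_var += lr * grad_log_var
    return mean, math.exp(log_var)


if __name__ == "__main__":
    observations = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(fit_gaussian_by_gradient_ascent(observations))
```

For this toy data the estimates should approach the closed-form maximum-likelihood values (the sample mean and the biased sample variance), which is a convenient sanity check on the gradients.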

2. Granted invention patent

US08239195B2 Adapting a compressed model for use in speech recognition (in force)
Publication number: US08239195B2
Publication date: 2012-08-07
Application number: US12235748
Filing date: 2008-09-23
Applicant: Jinyu Li, Li Deng, Dong Yu, Jian Wu, Yifan Gong, Alejandro Acero
Inventor: Jinyu Li, Li Deng, Dong Yu, Jian Wu, Yifan Gong, Alejandro Acero
IPC classification: G10L15/20
CPC classification: G10L15/20, G10L15/065
Abstract: A speech recognition system includes a receiver component that receives a distorted speech utterance. The speech recognition system also includes an adaptor component that selectively adapts parameters of a compressed model used to recognize at least a portion of the distorted speech utterance, wherein the adaptor component selectively adapts the parameters of the compressed model based at least in part upon the received distorted speech utterance.

3. Invention patent application

US20110270610A1 PARAMETER LEARNING IN A HIDDEN TRAJECTORY MODEL (in force)
Publication number: US20110270610A1
Publication date: 2011-11-03
Application number: US13182971
Filing date: 2011-07-14
Applicant: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
Inventor: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
IPC classification: G10L15/00
CPC classification: G10L15/063, G10L2015/025
Abstract: Parameters for distributions of a hidden trajectory model including means and variances are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation includes only acoustic data and not any intermediate estimate on hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.

4. Granted invention patent

US08010356B2 Parameter learning in a hidden trajectory model (in force)
Publication number: US08010356B2
Publication date: 2011-08-30
Application number: US11356898
Filing date: 2006-02-17
Applicant: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
Inventor: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
IPC classification: G10L15/00
CPC classification: G10L15/063, G10L2015/025
Abstract: Parameters for distributions of a hidden trajectory model including means and variances are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation includes only acoustic data and not any intermediate estimate on hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.

5. Invention patent application

US20100153104A1 Noise Suppressor for Robust Speech Recognition (in force)
Publication number: US20100153104A1
Publication date: 2010-06-17
Application number: US12335558
Filing date: 2008-12-16
Applicant: Dong Yu, Li Deng, Yifan Gong, Jian Wu, Alejandro Acero
Inventor: Dong Yu, Li Deng, Yifan Gong, Jian Wu, Alejandro Acero
IPC classification: G10L15/20
CPC classification: G10L21/0208, G10L15/20
Abstract: Described is noise reduction technology generally for speech input in which a noise-suppression related gain value for the frame is determined based upon a noise level associated with that frame in addition to the signal to noise ratios (SNRs). In one implementation, a noise reduction mechanism is based upon minimum mean square error, Mel-frequency cepstra noise reduction technology. A high gain value (e.g., one) is set to accomplish little or no noise suppression when the noise level is below a threshold low level, and a low gain value is set or computed to accomplish large noise suppression above a threshold high noise level. A noise-power dependent function, e.g., a log-linear interpolation, is used to compute the gain between the thresholds. Smoothing may be performed by modifying the gain value based upon a prior frame's gain value. Also described is learning parameters used in noise reduction via a step-adaptive discriminative learning algorithm.
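
The thresholded gain rule in the abstract above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented mechanism: "log-linear interpolation" is read here as interpolation that is linear in log noise power, the SNR dependence mentioned in the abstract is left out, and the names `suppression_gain` and `smooth_gains`, the floor `min_gain`, and the smoothing weight `alpha` are hypothetical.

```python
import math


def suppression_gain(noise_power, low_thr, high_thr, min_gain=0.1):
    """Map a frame's noise power to a gain in [min_gain, 1].

    Below low_thr the gain is 1 (little or no suppression); above
    high_thr it is min_gain (strong suppression). In between, the gain
    is interpolated linearly on a log-power axis (an assumed reading of
    "log-linear interpolation"). Thresholds must be positive.
    """
    if noise_power <= low_thr:
        return 1.0
    if noise_power >= high_thr:
        return min_gain
    # Relative position of the noise power between the thresholds, in log domain
    t = (math.log(noise_power) - math.log(low_thr)) / (
        math.log(high_thr) - math.log(low_thr)
    )
    return 1.0 + t * (min_gain - 1.0)


def smooth_gains(gains, alpha=0.5):
    """Smooth per-frame gains using the previous frame's smoothed gain."""
    smoothed = []
    prev = None
    for g in gains:
        # First frame passes through; later frames blend with the prior gain
        prev = g if prev is None else alpha * prev + (1.0 - alpha) * g
        smoothed.append(prev)
    return smoothed


if __name__ == "__main__":
    frame_noise = [0.5, 3.0, 30.0, 300.0]
    raw = [suppression_gain(p, low_thr=1.0, high_thr=100.0) for p in frame_noise]
    print(raw)
    print(smooth_gains(raw))
```

A first-order recursive average is only one way to modify a gain based on the prior frame's gain; the abstract does not say which smoothing form is used.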

6. Invention patent application

US20100070279A1 PIECEWISE-BASED VARIABLE-PARAMETER HIDDEN MARKOV MODELS AND THE TRAINING THEREOF (in force)
Publication number: US20100070279A1
Publication date: 2010-03-18
Application number: US12211114
Filing date: 2008-09-16
Applicant: Dong Yu, Li Deng, Yifan Gong, Alejandro Acero
Inventor: Dong Yu, Li Deng, Yifan Gong, Alejandro Acero
IPC classification: G10L15/14
CPC classification: G10L15/144
Abstract: A speech recognition system uses Gaussian mixture variable-parameter hidden Markov models (VPHMMs) to recognize speech under many different conditions. Each Gaussian mixture component of the VPHMMs is characterized by a mean parameter μ and a variance parameter Σ. Each of these Gaussian parameters varies as a function of at least one environmental conditioning parameter, such as, but not limited to, instantaneous signal-to-noise-ratio (SNR). The way in which a Gaussian parameter varies with the environmental conditioning parameter(s) can be approximated as a piecewise function, such as a cubic spline function. Further, the recognition system formulates the mean parameter μ and the variance parameter Σ of each Gaussian mixture component in an efficient form that accommodates the use of discriminative training and parameter sharing. Parameter sharing is carried out so that the otherwise very large number of parameters in the VPHMMs can be effectively reduced with practically feasible amounts of training data.
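
To make the idea of a Gaussian parameter that varies with SNR more concrete, here is a small sketch in which a scalar mean and a log-variance are piecewise-cubic functions of the SNR. It uses a cubic Hermite spline with finite-difference tangents as a stand-in, since the abstract does not give the exact spline form; the knot positions, the values, and the names `hermite_spline` and `gaussian_log_density` are assumptions for illustration only.

```python
import bisect
import math


def hermite_spline(knots, values, x):
    """Piecewise-cubic interpolation through (knots[i], values[i]).

    Tangents come from finite differences of neighbouring knots.
    Outside the knot range the end values are held constant.
    """
    n = len(knots)
    if x <= knots[0]:
        return values[0]
    if x >= knots[-1]:
        return values[-1]
    i = bisect.bisect_right(knots, x) - 1  # knots[i] <= x < knots[i + 1]
    h = knots[i + 1] - knots[i]
    t = (x - knots[i]) / h

    def slope(j):
        # Finite-difference tangent; one-sided at the two end knots
        lo, hi = max(j - 1, 0), min(j + 1, n - 1)
        return (values[hi] - values[lo]) / (knots[hi] - knots[lo])

    m0, m1 = slope(i), slope(i + 1)
    # Standard cubic Hermite basis functions on [0, 1]
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * values[i] + h10 * h * m0 + h01 * values[i + 1] + h11 * h * m1


def gaussian_log_density(obs, snr_db, knots, mean_values, log_var_values):
    """Log-density of a scalar Gaussian whose parameters depend on SNR.

    The log-variance (not the variance) is interpolated so that the
    resulting variance is always positive.
    """
    mean = hermite_spline(knots, mean_values, snr_db)
    var = math.exp(hermite_spline(knots, log_var_values, snr_db))
    return -0.5 * (math.log(2 * math.pi * var) + (obs - mean) ** 2 / var)


if __name__ == "__main__":
    snr_knots = [0.0, 10.0, 20.0]  # hypothetical SNR knots in dB
    print(hermite_spline(snr_knots, [0.0, 1.0, 4.0], 5.0))
    print(gaussian_log_density(1.0, 10.0, snr_knots, [0.0, 1.0, 4.0], [0.0, 0.0, 0.0]))
```

In a full recognizer each mixture component would carry vector-valued spline parameters; the scalar case is kept here only to show how the conditioning variable enters the density.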

7. Granted invention patent

US07653535B2 Learning statistically characterized resonance targets in a hidden trajectory model (in force)
Publication number: US07653535B2
Publication date: 2010-01-26
Application number: US11303899
Filing date: 2005-12-15
Applicant: Li Deng, Dong Yu, Alejandro Acero
Inventor: Li Deng, Dong Yu, Alejandro Acero
IPC classification: G10L19/06, G10L19/14, G10L11/04, G10L15/00
CPC classification: G10L15/14, G10L25/12, G10L25/15, G10L25/24
Abstract: A statistical trajectory speech model is constructed where the targets for vocal tract resonances are represented as random vectors and where the mean vectors of the target distributions are estimated using a likelihood function for joint acoustic observation vectors. The target mean vectors can be estimated without formant data. To form the model, time-dependent filter parameter vectors based on time-dependent coarticulation parameters are constructed that are a function of the ordering and identity of the phones in the phone sequence in each speech utterance. The filter parameter vectors are also a function of the temporal extent of coarticulation and of the speaker's speaking effort.

8. Invention patent application

US20090177468A1 SPEECH RECOGNITION WITH NON-LINEAR NOISE REDUCTION ON MEL-FREQUENCY CEPTRA (in force)
Publication number: US20090177468A1
Publication date: 2009-07-09
Application number: US11970537
Filing date: 2008-01-08
Applicant: Dong Yu, Alejandro Acero, James G. Droppo, Li Deng
Inventor: Dong Yu, Alejandro Acero, James G. Droppo, Li Deng
IPC classification: G10L15/20
CPC classification: G10L15/20, G10L15/02, G10L21/02, G10L25/24
Abstract: In an automatic speech recognition system, a feature extractor extracts features from a speech signal, and speech is recognized by the automatic speech recognition system based on the extracted features. Noise reduction as part of the feature extractor is provided by feature enhancement in which feature-domain noise reduction in the form of Mel-frequency cepstra is provided based on the minimum mean square error criterion. Specifically, the devised method takes into account the random phase between the clean speech and the mixing noise. The feature-domain noise reduction is performed in a dimension-wise fashion to the individual dimensions of the feature vectors input to the automatic speech recognition system, in order to perform environment-robust speech recognition.

9. Granted invention patent

US07519531B2 Speaker adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation (in force)
Publication number: US07519531B2
Publication date: 2009-04-14
Application number: US11093833
Filing date: 2005-03-30
Applicant: Alejandro Acero, Dong Yu, Li Deng
Inventor: Alejandro Acero, Dong Yu, Li Deng
IPC classification: G10L19/06
CPC classification: G10L15/07, G10L2015/0638
Abstract: A computer-implemented method is provided for training a hidden trajectory model, of a speech recognition system, which generates Vocal Tract Resonance (VTR) targets. The method includes obtaining generic VTR target parameters corresponding to a generic speaker used by a target selector to generate VTR target sequences. The generic VTR target parameters are scaled for a particular speaker using a speaker-dependent scaling factor for the particular speaker to generate speaker-adaptive VTR target parameters. This scaling is performed for both the training data and the test data, and for the training data, the scaling is performed iteratively with the process of obtaining the generic targets. The computation of the scaling factor makes use of the results of a VTR tracker. The speaker-adaptive VTR target parameters for the particular speaker are then stored in order to configure the hidden trajectory model to perform speech recognition for the particular speaker using the speaker-adaptive VTR target parameters.

10. Invention patent application

US20080201139A1 Generic framework for large-margin MCE training in speech recognition (in force)
Publication number: US20080201139A1
Publication date: 2008-08-21
Application number: US11708440
Filing date: 2007-02-20
Applicant: Dong Yu, Alejandro Acero, Li Deng, Xiaodong He
Inventor: Dong Yu, Alejandro Acero, Li Deng, Xiaodong He
IPC classification: G10L15/00
CPC classification: G10L15/063, G10L2015/0631
Abstract: A method and apparatus for training an acoustic model are disclosed. A training corpus is accessed and converted into an initial acoustic model. Scores are calculated for a correct class and competitive classes, respectively, for each token given the initial acoustic model. Also, a sample-adaptive window bandwidth is calculated for each training token. From the calculated scores and the sample-adaptive window bandwidth values, loss values are calculated based on a loss function. The loss function, which may be derived from a Bayesian risk minimization viewpoint, can include a margin value that moves a decision boundary such that token-to-boundary distances for correct tokens that are near the decision boundary are maximized. The margin can either be a fixed margin or can vary monotonically as a function of algorithm iterations. The acoustic model is updated based on the calculated loss values. This process can be repeated until an empirical convergence is met.
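
The role of the margin in the loss described above can be illustrated with a conventional sigmoid-shaped minimum classification error (MCE) loss. This is a sketch under assumptions rather than the loss defined in the application: the misclassification measure uses only the best competitor, the sample-adaptive window bandwidth is not modelled (a fixed slope `alpha` is used instead), and the names `mce_loss` and `margin_schedule` and the linear, capped schedule are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mce_loss(correct_score, competitor_scores, margin=0.0, alpha=1.0):
    """Sigmoid MCE loss for one token, with an additive margin.

    d < 0 means the token is classified correctly. A positive margin
    shifts the decision boundary, so correct tokens that sit close to
    the boundary still incur a noticeable loss.
    """
    d = max(competitor_scores) - correct_score  # misclassification measure
    return sigmoid(alpha * (d + margin))


def margin_schedule(iteration, step=0.1, max_margin=1.0):
    """A margin that grows monotonically with the iteration count (assumed form)."""
    return min(max_margin, step * iteration)


if __name__ == "__main__":
    for it in range(0, 12, 4):
        m = margin_schedule(it)
        print(it, m, mce_loss(2.0, [1.0, 0.5], margin=m))
```

With `margin=0` a token that beats its best competitor by one score unit has a loss below 0.5; raising the margin to 1 moves that same token onto the shifted boundary, where the loss is 0.5.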
