Quick Patent Search: fast search of global patents, free patent database for commercial use - IPRDB

1. Patent application

US20060085191A1 Method of speech recognition using time-dependent interpolation and hidden dynamic value classes (in force)
Publication number: US20060085191A1
Publication date: 2006-04-20
Application number: US11294858
Filing date: 2005-12-06
Applicants: Li Deng, Jian-lai Zhou, Frank Seide, Asela Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
Inventors: Li Deng, Jian-lai Zhou, Frank Seide, Asela Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
IPC classification: G10L15/14
CPC classification: G10L15/12, G10L2015/025
Abstract: A speech signal is decoded by determining a production-related value for a current state based on an optimal production-related value at the end of a preceding state, the optimal production-related value being selected from a set of continuous values. The production-related value is used to determine a likelihood of a phone being represented by a set of observation vectors that are aligned with a path between the preceding state and the current state. The likelihood of the phone is combined with a score from the preceding state to determine a score for the current state, the score from the preceding state being associated with a discrete class of production-related values wherein the class matches the class of the optimal production-related value.

2. Granted patent

US07050975B2 Method of speech recognition using time-dependent interpolation and hidden dynamic value classes (in force)
Publication number: US07050975B2
Publication date: 2006-05-23
Application number: US10267522
Filing date: 2002-10-09
Applicants: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide, Asela J. R. Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
Inventors: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide, Asela J. R. Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
IPC classification: G10L15/14
CPC classification: G10L15/12, G10L2015/025
Abstract: A method of speech recognition is provided that identifies a production-related dynamics value by performing a linear interpolation between a production-related dynamics value at a previous time and a production-related target using a time-dependent interpolation weight. The hidden production-related dynamics value is used to compute a predicted value that is compared to an observed value of acoustics to determine the likelihood of the observed acoustics given a sequence of hidden phonological units. In some embodiments, the production-related dynamics value at the previous time is selected from a set of continuous values. In addition, the likelihood of the observed acoustics given a sequence of hidden phonological units is combined with a score associated with a discrete class of production-related dynamic values at the previous time to determine a score for a current phonological state.
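Illustrative note (not part of the patent record): the interpolation step described in the abstract of US07050975B2 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented method. The function names, the scalar (one-dimensional) values, the convention that the weight multiplies the previous value, and the Gaussian form of the likelihood are all assumptions made here for illustration.

```python
import math


def interpolate_dynamics(z_prev, target, weights):
    """Trace z_t = w_t * z_{t-1} + (1 - w_t) * target over time.

    Assumption: each time-dependent weight w_t lies in [0, 1] and is
    applied to the previous value; the abstract does not fix this.
    """
    trajectory = []
    z = z_prev
    for w in weights:
        z = w * z + (1.0 - w) * target
        trajectory.append(z)
    return trajectory


def gaussian_log_likelihood(observed, predicted, variance):
    """Log-likelihood of an observation under a Gaussian centred on the
    predicted value (a hypothetical choice of observation model)."""
    residual = observed - predicted
    return -0.5 * (math.log(2.0 * math.pi * variance)
                   + residual * residual / variance)


# The hidden value moves halfway toward the target at each step.
trajectory = interpolate_dynamics(z_prev=0.0, target=1.0, weights=[0.5, 0.5])
print(trajectory)  # [0.5, 0.75]
```

With a weight of 1.0 the hidden value stays at its previous value, and with a weight of 0.0 it jumps straight to the target, so the weight sequence controls how quickly the trajectory approaches the target.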

3. Granted patent

US07206741B2 Method of speech recognition using time-dependent interpolation and hidden dynamic value classes (in force)
Publication number: US07206741B2
Publication date: 2007-04-17
Application number: US11294858
Filing date: 2005-12-06
Applicants: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide, Asela J. R. Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
Inventors: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide, Asela J. R. Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
IPC classification: G10L15/04
CPC classification: G10L15/12, G10L2015/025
Abstract: A speech signal is decoded by determining a production-related value for a current state based on an optimal production-related value at the end of a preceding state, the optimal production-related value being selected from a set of continuous values. The production-related value is used to determine a likelihood of a phone being represented by a set of observation vectors that are aligned with a path between the preceding state and the current state. The likelihood of the phone is combined with a score from the preceding state to determine a score for the current state, the score from the preceding state being associated with a discrete class of production-related values wherein the class matches the class of the optimal production-related value.

4. Granted patent

US06990447B2 Method and apparatus for denoising and deverberation using variational inference and strong speech models (in force)
Publication number: US06990447B2
Publication date: 2006-01-24
Application number: US09999576
Filing date: 2001-11-15
Applicants: Hagai Attias, John Carlton Platt, Li Deng, Alejandro Acero
Inventors: Hagai Attias, John Carlton Platt, Li Deng, Alejandro Acero
IPC classification: G10L15/08, G10L15/12, G10L15/06, G10L21/02
CPC classification: G10L21/0208, G10L2021/02082, H04R2225/43
Abstract: A probability distribution for speech model parameters, such as auto-regression parameters, is used to identify a distribution of denoised values from a noisy signal. Under one embodiment, the probability distributions of the speech model parameters and the denoised values are adjusted to improve a variational inference so that the variational inference better approximates the joint probability of the speech model parameters and the denoised values given a noisy signal. In some embodiments, this improvement is performed during an expectation step in an expectation-maximization algorithm. The statistical model can also be used to identify an average spectrum for the clean signal and this average spectrum may be provided to a speech recognizer instead of the estimate of the clean signal.

5. Patent application

US20050114134A1 Method and apparatus for continuous valued vocal tract resonance tracking using piecewise linear approximations (under examination, published)
Publication number: US20050114134A1
Publication date: 2005-05-26
Application number: US10723995
Filing date: 2003-11-26
Applicants: Li Deng, Hagai Attias, Alejandro Acero, Leo Lee
Inventors: Li Deng, Hagai Attias, Alejandro Acero, Leo Lee
IPC classification: G10L15/10, G10L11/00, G10L15/02, G10L15/14, G10L15/28, G10L19/06
CPC classification: G10L25/48, G10L25/15
Abstract: A method and apparatus tracks vocal tract resonance components, including both frequencies and bandwidths, in a speech signal. The components are tracked by defining a state equation that is linear with respect to a past vocal tract resonance vector and that predicts a current vocal tract resonance vector. An observation equation is also defined that is linear with respect to a current vocal tract resonance vector and that predicts at least one component of an observation vector. The state equation, the observation equation, and a sequence of observation vectors are used to identify a sequence of vocal tract resonance vectors using Kalman filter algorithm. Under one embodiment, the observation equation is defined based on a piecewise linear approximation to a non-linear function. The parameters of the linear approximation are selected based on pre-defined regions, which are determined from a crude estimate of a vocal tract resonance vector.
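Illustrative note (not part of the patent record): the abstract of US20050114134A1 combines a linear state equation and a linear observation equation in a Kalman filter. The sketch below shows a generic scalar Kalman filter of that shape. It is a textbook one-dimensional filter under stated assumptions, not the patent's tracker: the function name, the parameters `a`, `q`, `h`, `r`, `x0`, `p0`, and the use of a single fixed observation slope `h` (rather than region-dependent piecewise linear parameters) are all assumptions made here.

```python
def kalman_filter_1d(observations, a, q, h, r, x0, p0):
    """Scalar Kalman filter.

    Assumed model (hypothetical notation):
        state:       x_t = a * x_{t-1} + noise with variance q
        observation: o_t = h * x_t     + noise with variance r
    Returns the filtered state estimate after each observation.
    """
    x, p = x0, p0
    estimates = []
    for o in observations:
        # Predict the current state from the past state (state equation).
        x_pred = a * x
        p_pred = a * p * a + q
        # Correct the prediction with the observation (observation equation).
        gain = p_pred * h / (h * p_pred * h + r)
        x = x_pred + gain * (o - h * x_pred)
        p = (1.0 - gain * h) * p_pred
        estimates.append(x)
    return estimates


# Equal prior and observation uncertainty: the estimate lands halfway
# between the predicted value (0.0) and the observation (2.0).
print(kalman_filter_1d([2.0], a=1.0, q=0.0, h=1.0, r=1.0, x0=0.0, p0=1.0))  # [1.0]
```

In a piecewise linear variant, `h` (and an offset term) would be looked up per region before each update step; that lookup is omitted here because the abstract does not specify it.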

6. Granted patent

US07480615B2 Method of speech recognition using multimodal variational inference with switching state space models (in force)
Publication number: US07480615B2
Publication date: 2009-01-20
Application number: US10760937
Filing date: 2004-01-20
Applicants: Hagai Attias, Li Deng, Leo Lee
Inventors: Hagai Attias, Li Deng, Leo Lee
IPC classification: G06F17/20, G10L15/14, G10L15/00, G10L15/28, G05B15/00
CPC classification: G10L15/14, G10L2015/0638
Abstract: A method of efficiently setting posterior probability parameters for a switching state space model begins by defining a window containing at least two but fewer than all of the frames. A separate posterior probability parameter is determined for each frame in the window. The window is then shifted sequentially from left to right in time so that it includes one or more subsequent frames in the sequence of frames. A separate posterior probability parameter is then determined for each frame in the shifted window. This method closely approximates a more rigorous solution but saves computational cost by two to three orders of magnitude. Further, a method of determining the optimal discrete state sequence in the switching state space model is invented that directly exploits the observation vector on a frame-by-frame basis and operates from left to right in time.
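Illustrative note (not part of the patent record): the windowing scheme in the abstract of US07480615B2 (a window of at least two but fewer than all frames, shifted left to right) can be sketched as a schedule of frame indices. This is a minimal sketch of the window bookkeeping only. The function name and the `shift` parameter are assumptions, and the per-frame posterior parameter computation is deliberately left out because the abstract does not describe it.

```python
def sliding_windows(num_frames, window_size, shift=1):
    """List the frame indices covered by each window position.

    The window must hold at least two frames but fewer than all frames,
    and it moves left to right by `shift` frames (an assumed parameter).
    """
    if window_size < 2 or window_size >= num_frames:
        raise ValueError("window must hold at least 2 but fewer than all frames")
    if shift < 1:
        raise ValueError("shift must be a positive number of frames")
    windows = []
    start = 0
    while start + window_size <= num_frames:
        windows.append(list(range(start, start + window_size)))
        start += shift
    return windows


# Five frames, a three-frame window, shifted one frame at a time.
print(sliding_windows(num_frames=5, window_size=3))
# [[0, 1, 2], [1, 2, 3], [2, 3, 4]]
```

Each inner list marks the frames whose posterior parameters would be (re)estimated at that window position; because every window is much shorter than the full sequence, the work per position stays bounded regardless of utterance length.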

7. Granted patent

US07103541B2 Microphone array signal enhancement using mixture models (in force)
Publication number: US07103541B2
Publication date: 2006-09-05
Application number: US10183267
Filing date: 2002-06-27
Applicants: Hagai Attias, Li Deng
Inventors: Hagai Attias, Li Deng
IPC classification: G10L21/02
CPC classification: G10L21/02, G10L2021/02161
Abstract: A system and method facilitating signal enhancement utilizing mixture models is provided. The invention includes a signal enhancement adaptive system having a speech model, a noise model and a plurality of adaptive filter parameters. The signal enhancement adaptive system employs probabilistic modeling to perform signal enhancement of a plurality of windowed frequency transformed input signals received, for example, for an array of microphones. The signal enhancement adaptive system incorporates information about the statistical structure of speech signals. The signal enhancement adaptive system can be embedded in an overall enhancement system which also includes components of signal windowing and frequency transformation.

8. Patent application

US20050159951A1 Method of speech recognition using multimodal variational inference with switching state space models (in force)
Publication number: US20050159951A1
Publication date: 2005-07-21
Application number: US10760937
Filing date: 2004-01-20
Applicants: Hagai Attias, Li Deng, Leo Lee
Inventors: Hagai Attias, Li Deng, Leo Lee
IPC classification: G10L15/06, G06F7/00, G06F17/10, G10L15/10, G10L15/12, G10L15/14, G10L15/00
CPC classification: G10L15/14, G10L2015/0638
Abstract: A method of efficiently setting posterior probability parameters for a switching state space model begins by defining a window containing at least two but fewer than all of the frames. A separate posterior probability parameter is determined for each frame in the window. The window is then shifted sequentially from left to right in time so that it includes one or more subsequent frames in the sequence of frames. A separate posterior probability parameter is then determined for each frame in the shifted window. This method closely approximates a more rigorous solution but saves computational cost by two to three orders of magnitude. Further, a method of determining the optimal discrete state sequence in the switching state space model is invented that directly exploits the observation vector on a frame-by-frame basis and operates from left to right in time.

9. Patent application

US20050119887A1 Method of speech recognition using variational inference with switching state space models (lapsed)
Publication number: US20050119887A1
Publication date: 2005-06-02
Application number: US10984609
Filing date: 2004-11-09
Applicants: Hagai Attias, Leo Lee, Li Deng
Inventors: Hagai Attias, Leo Lee, Li Deng
IPC classification: G10L15/06, G10L15/10, G10L15/14, G10L15/12
CPC classification: G10L15/14
Abstract: A method is developed which includes 1) defining a switching state space model for a continuous valued hidden production-related parameter and the observed speech acoustics, and 2) approximating a posterior probability that provides the likelihood of a sequence of the hidden production-related parameters and a sequence of speech units based on a sequence of observed input values. In approximating the posterior probability, the boundaries of the speech units are not fixed but are optimally determined. Under one embodiment, a mixture of Gaussian approximation is used. In another embodiment, an HMM posterior approximation is used.

10. Granted patent

US07487087B2 Method of speech recognition using variational inference with switching state space models (lapsed)
Publication number: US07487087B2
Publication date: 2009-02-03
Application number: US10984609
Filing date: 2004-11-09
Applicants: Hagai Attias, Leo Jingyu Lee, Li Deng
Inventors: Hagai Attias, Leo Jingyu Lee, Li Deng
IPC classification: G10L15/08
CPC classification: G10L15/14
Abstract: A method is developed which includes 1) defining a switching state space model for a continuous valued hidden production-related parameter and the observed speech acoustics, and 2) approximating a posterior probability that provides the likelihood of a sequence of the hidden production-related parameters and a sequence of speech units based on a sequence of observed input values. In approximating the posterior probability, the boundaries of the speech units are not fixed but are optimally determined. Under one embodiment, a mixture of Gaussian approximation is used. In another embodiment, an HMM posterior approximation is used.
