Quick Patent Search - fast search of global patents, free patent database for commercial use - IPRDB

1. Invention application

US20050091042A1 Sound source separation using convolutional mixing and a priori sound source knowledge (status: in force)
Publication No.: US20050091042A1
Publication date: 2005-04-28
Application No.: US10992051
Filing date: 2004-11-18
Applicants: Alejandro Acero; Steven Altschuler; Lani Wu
Inventors: Alejandro Acero; Steven Altschuler; Lani Wu
IPC classification: G10L11/02, G10L21/02, H04R3/00
CPC classification: G10L25/78, G10L21/0264, G10L2021/02082, G10L2021/02161
Abstract: Sound source separation, without permutation, using convolutional mixing independent component analysis based on a priori knowledge of the target sound source is disclosed. The target sound source can be a human speaker. The reconstruction filters used in the sound source separation take into account the a priori knowledge of the target sound source, such as an estimate of the spectra of the target sound source. The filters may be generally constructed based on a speech recognition system. Matching the words of the dictionary of the speech recognition system to a reconstructed signal indicates whether proper separation has occurred. More specifically, the filters may be constructed based on a vector quantization codebook of vectors representing typical sound source patterns. Matching the vectors of the codebook to a reconstructed signal indicates whether proper separation has occurred. The vectors may be linear prediction vectors, among others.
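
The codebook check mentioned in this abstract can be illustrated with a minimal sketch. It assumes the reconstructed signal has already been cut into feature vectors (for example linear prediction vectors), uses squared Euclidean distance as the distortion measure, and compares the average distortion against a hypothetical threshold. The function names, the distance measure, and the threshold are illustrative assumptions, not details from the patent.

```python
from typing import Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_codebook_distortion(frames: Sequence[Vector],
                             codebook: Sequence[Vector]) -> float:
    """Average distance from each frame to its nearest codeword."""
    if not frames or not codebook:
        raise ValueError("frames and codebook must be non-empty")
    total = sum(min(squared_distance(f, c) for c in codebook) for f in frames)
    return total / len(frames)


def looks_separated(frames: Sequence[Vector],
                    codebook: Sequence[Vector],
                    threshold: float) -> bool:
    """Assumed decision rule: low distortion means the frames match the codebook."""
    return mean_codebook_distortion(frames, codebook) <= threshold


# Toy data (made up): two codewords and two candidate reconstructions.
codebook = [(0.0, 0.0), (1.0, 1.0)]
close_frames = [(0.1, 0.0), (0.9, 1.0)]   # near the codewords
far_frames = [(0.5, 0.5), (3.0, -2.0)]    # far from the codewords
```

With a threshold of 0.1, `looks_separated` accepts `close_frames` and rejects `far_frames`; in practice the threshold would have to be tuned on real data.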

2. Granted invention patent

US09218412B2 Searching a database of listings (status: in force)
Publication No.: US09218412B2
Publication date: 2015-12-22
Application No.: US11746847
Filing date: 2007-05-10
Applicants: Ye-Yi Wang; Dong Yu; Yun-Cheng Ju; Alejandro Acero; Geoffrey G. Zweig
Inventors: Ye-Yi Wang; Dong Yu; Yun-Cheng Ju; Alejandro Acero; Geoffrey G. Zweig
IPC classification: G06F7/00, G06F17/30, G06F3/06, G10L15/187, G10L15/197
CPC classification: G06F17/30663, G06F3/0641, G06F17/3069, G10L15/187, G10L15/197
Abstract: A database having listings rather than long documents is searched using a term frequency-inverse document frequency (Tf/Idf) algorithm.
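
As a rough illustration of Tf/Idf search over short listings, the sketch below scores each listing against a query by cosine similarity of Tf/Idf vectors. The whitespace tokenizer, the smoothed IDF formula, and the sample listings are assumptions chosen for brevity; the abstract does not specify the weighting the patent uses.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


def build_idf(listings):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n = len(listings)
    df = Counter()
    for listing in listings:
        df.update(set(tokenize(listing)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def tfidf_vector(text, idf):
    """Map each known term to (term count) * (idf weight)."""
    counts = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, listings):
    """Return (score, listing) pairs, best match first."""
    idf = build_idf(listings)
    q = tfidf_vector(query, idf)
    scored = [(cosine(q, tfidf_vector(l, idf)), l) for l in listings]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up listings for illustration only.
listings = ["Joe's Pizza Seattle", "Seattle Public Library", "Pizza Hut Bellevue"]
results = search("joe's pizza", listings)
```

Because listings are only a few words long, the term-frequency part is almost always 0 or 1, so in this sketch the IDF weights do most of the ranking.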

3. Granted invention patent

US09054764B2 Sensor array beamformer post-processor (status: in force)
Publication No.: US09054764B2
Publication date: 2015-06-09
Application No.: US13187235
Filing date: 2011-07-20
Applicants: Ivan Tashev; Alejandro Acero
Inventors: Ivan Tashev; Alejandro Acero
IPC classification: H04R3/00, H04B7/08
CPC classification: H04B7/0854
Abstract: A novel beamforming post-processor technique with enhanced noise suppression capability. The present beamforming post-processor technique is a non-linear post-processing technique for sensor arrays (e.g., microphone arrays) which improves the directivity and signal separation capabilities. The technique works in so-called instantaneous direction of arrival space, estimates the probability for sound coming from a given incident angle or look-up direction and applies a time-varying, gain-based, spatio-temporal filter for suppressing sounds coming from directions other than the sound source direction, resulting in minimal artifacts and musical noise.
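
The idea of turning a direction-of-arrival estimate into a suppression gain can be sketched as follows. This is a simplified, per-frame illustration only: it assumes each frequency bin already has an instantaneous direction estimate in degrees, models the probability of sound coming from the look direction with a Gaussian-shaped bump, and clamps the gain at a floor. The bump shape, `sigma_deg`, and `gain_floor` are assumptions, not values from the patent.

```python
import math


def direction_probability(doa_deg, look_deg, sigma_deg=15.0):
    """Assumed likelihood that a bin's sound arrives from the look direction."""
    # Wrap the angular difference into [-180, 180) degrees.
    diff = (doa_deg - look_deg + 180.0) % 360.0 - 180.0
    return math.exp(-0.5 * (diff / sigma_deg) ** 2)


def postfilter_frame(bins, doas_deg, look_deg, sigma_deg=15.0, gain_floor=0.1):
    """Scale each frequency bin of one frame by a direction-dependent gain."""
    if len(bins) != len(doas_deg):
        raise ValueError("need one direction estimate per frequency bin")
    output = []
    for value, doa in zip(bins, doas_deg):
        probability = direction_probability(doa, look_deg, sigma_deg)
        gain = max(gain_floor, probability)  # the floor avoids zeroing bins
        output.append(gain * value)
    return output


# One made-up frame: the first bin arrives from the look direction (0 degrees),
# the second from the side (90 degrees).
frame = [1 + 1j, 2.0]
filtered = postfilter_frame(frame, [0.0, 90.0], look_deg=0.0)
```

Calling `postfilter_frame` once per frame with fresh direction estimates makes the gain time-varying; smoothing the gains across time and frequency would be a natural extension, but it is not shown here.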

4. Granted invention patent

US08818797B2 Dual-band speech encoding (status: in force)
Publication No.: US08818797B2
Publication date: 2014-08-26
Application No.: US12978197
Filing date: 2010-12-23
Applicants: Alejandro Acero; James G. Droppo, III; Michael L. Seltzer
Inventors: Alejandro Acero; James G. Droppo, III; Michael L. Seltzer
IPC classification: G10L21/00
CPC classification: G10L19/005, G10L15/02, G10L19/20, G10L21/038, G10L2019/0001
Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.

5. Granted invention patent

US08818002B2 Robust adaptive beamforming with enhanced noise suppression (status: in force)
Publication No.: US08818002B2
Publication date: 2014-08-26
Application No.: US13187618
Filing date: 2011-07-21
Applicants: Ivan Tashev; Alejandro Acero; Byung-Jun Yoon
Inventors: Ivan Tashev; Alejandro Acero; Byung-Jun Yoon
IPC classification: H04B15/00, G01S3/86, H04R3/00, H04B7/08
CPC classification: G01S3/86, H04B7/0854, H04R3/005, H04R2430/20
Abstract: A novel adaptive beamforming technique with enhanced noise suppression capability. The technique incorporates the sound-source presence probability into an adaptive blocking matrix. In one embodiment the sound-source presence probability is estimated based on the instantaneous direction of arrival of the input signals and voice activity detection. The technique guarantees robustness to steering vector errors without imposing ad hoc constraints on the adaptive filter coefficients. It can provide good suppression performance for both directional interference signals and isotropic ambient noise.

6. Invention application

US20130253930A1 FACTORED TRANSFORMS FOR SEPARABLE ADAPTATION OF ACOUSTIC MODELS (status: in force)
Publication No.: US20130253930A1
Publication date: 2013-09-26
Application No.: US13427907
Filing date: 2012-03-23
Applicants: Michael Lewis Seltzer; Alejandro Acero
Inventors: Michael Lewis Seltzer; Alejandro Acero
IPC classification: G10L15/00
CPC classification: G10L15/063, G10L15/07, G10L15/20
Abstract: Various technologies described herein pertain to adapting a speech recognizer to input speech data. A first linear transform can be selected from a first set of linear transforms based on a value of a first variability source corresponding to the input speech data, and a second linear transform can be selected from a second set of linear transforms based on a value of a second variability source corresponding to the input speech data. The linear transforms in the first and second sets can compensate for the first variability source and the second variability source, respectively. Moreover, the first linear transform can be applied to the input speech data to generate intermediate transformed speech data, and the second linear transform can be applied to the intermediate transformed speech data to generate transformed speech data. Further, speech can be recognized based on the transformed speech data to obtain a result.
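
The two-stage selection and application described in this abstract can be sketched in a few lines. The sketch assumes each transform is stored as a matrix plus a bias vector (an affine form, chosen here for illustration) and that the two variability sources are an environment label and a speaker label. The labels, the toy matrices, and the function names are hypothetical.

```python
def apply_affine(matrix, bias, x):
    """Compute matrix @ x + bias for plain Python lists."""
    return [sum(a * v for a, v in zip(row, x)) + b
            for row, b in zip(matrix, bias)]


def adapt(x, first_set, first_value, second_set, second_value):
    """Select one transform per variability source, then apply them in order."""
    first_matrix, first_bias = first_set[first_value]
    second_matrix, second_bias = second_set[second_value]
    intermediate = apply_affine(first_matrix, first_bias, x)
    return apply_affine(second_matrix, second_bias, intermediate)


# Hypothetical transform sets, keyed by the value of each variability source.
environment_transforms = {
    "quiet": ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    "car": ([[0.5, 0.0], [0.0, 0.5]], [1.0, 1.0]),
}
speaker_transforms = {
    "spk_a": ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    "spk_b": ([[0.0, 1.0], [1.0, 0.0]], [0.0, -1.0]),
}

features = [2.0, 4.0]
adapted = adapt(features, environment_transforms, "car",
                speaker_transforms, "spk_b")
```

Keeping the two sets separate means that a system with E environments and S speakers stores E + S transforms rather than one per (environment, speaker) pair, which is one plausible reading of "separable adaptation" in the title.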

7. Granted invention patent

US08442828B2 Conditional model for natural language understanding (status: in force)
Publication No.: US08442828B2
Publication date: 2013-05-14
Application No.: US11378710
Filing date: 2006-03-17
Applicants: Ye-Yi Wang; Alejandro Acero; John Sie Yuen Lee; Milind V. Mahajan
Inventors: Ye-Yi Wang; Alejandro Acero; John Sie Yuen Lee; Milind V. Mahajan
IPC classification: G10L15/18, G10L15/14
CPC classification: G10L15/1822, G06F17/2785
Abstract: A conditional model is used in spoken language understanding. One such model is a conditional random field model.

8. Granted invention patent

US08285542B2 Adapting a language model to accommodate inputs not found in a directory assistance listing (status: in force)
Publication No.: US08285542B2
Publication date: 2012-10-09
Application No.: US13027921
Filing date: 2011-02-15
Applicants: Dong Yu; Alejandro Acero; Yun-Cheng Ju
Inventors: Dong Yu; Alejandro Acero; Yun-Cheng Ju
IPC classification: G06F17/21
CPC classification: G10L15/063, G10L15/197
Abstract: A statistical language model is trained for use in a directory assistance system using the data in a directory assistance listing corpus. Calculations are made to determine how important words in the corpus are in distinguishing a listing from other listings, and how likely words are to be omitted or added by a user. The language model is trained using these calculations.

9. Granted invention patent

US08214215B2 Phase sensitive model adaptation for noisy speech recognition (status: in force)
Publication No.: US08214215B2
Publication date: 2012-07-03
Application No.: US12236530
Filing date: 2008-09-24
Applicants: Jinyu Li; Li Deng; Dong Yu; Yifan Gong; Alejandro Acero
Inventors: Jinyu Li; Li Deng; Dong Yu; Yifan Gong; Alejandro Acero
IPC classification: G10L15/14
CPC classification: G10L15/065, G10L15/20
Abstract: A speech recognition system described herein includes a receiver component that receives a distorted speech utterance. The speech recognition system also includes an updater component that is in communication with a first model and a second model, wherein the updater component automatically updates parameters of the second model based at least in part upon joint estimates of additive and convolutive distortions output by the first model, wherein the joint estimates of additive and convolutive distortions are estimates of distortions based on a phase-sensitive model in the speech utterance received by the receiver component. Further, distortions other than additive and convolutive distortions, including other stationary and nonstationary sources, can also be estimated and used to update the parameters of the second model.
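
For orientation, one standard way to write a phase-sensitive distortion relation is sketched below. It assumes the usual per-frequency-bin model in which clean speech X passes through a channel H (the convolutive distortion) and noise N is added (the additive distortion). The symbols and this exact form are assumptions for illustration; the abstract does not state which formulation the patent uses.

```latex
Y = XH + N
\quad\Longrightarrow\quad
|Y|^2 = |X|^2\,|H|^2 + |N|^2 + 2\,|X|\,|H|\,|N|\cos\theta
```

Here \theta is the phase angle between XH and N. A phase-insensitive model drops the cross term, which amounts to treating \cos\theta as zero.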

10. Granted invention patent

US08160878B2 Piecewise-based variable-parameter Hidden Markov Models and the training thereof (status: in force)
Publication No.: US08160878B2
Publication date: 2012-04-17
Application No.: US12211114
Filing date: 2008-09-16
Applicants: Dong Yu; Li Deng; Yifan Gong; Alejandro Acero
Inventors: Dong Yu; Li Deng; Yifan Gong; Alejandro Acero
IPC classification: G10L15/14, G10L15/20
CPC classification: G10L15/144
Abstract: A speech recognition system uses Gaussian mixture variable-parameter hidden Markov models (VPHMMs) to recognize speech under many different conditions. Each Gaussian mixture component of the VPHMMs is characterized by a mean parameter μ and a variance parameter Σ. Each of these Gaussian parameters varies as a function of at least one environmental conditioning parameter, such as, but not limited to, instantaneous signal-to-noise ratio (SNR). The way in which a Gaussian parameter varies with the environmental conditioning parameter(s) can be approximated as a piecewise function, such as a cubic spline function. Further, the recognition system formulates the mean parameter μ and the variance parameter Σ of each Gaussian mixture component in an efficient form that accommodates the use of discriminative training and parameter sharing. Parameter sharing is carried out so that the otherwise very large number of parameters in the VPHMMs can be effectively reduced with practically feasible amounts of training data.
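
To make the idea of a Gaussian parameter that varies with SNR concrete, the sketch below evaluates a mean as a piecewise function of SNR. It deliberately uses piecewise-linear interpolation between knots rather than the cubic spline the abstract gives as an example, because the linear case is shorter to show. The knot positions, the knot values, and the clamping outside the knot range are all assumptions.

```python
from bisect import bisect_right


def piecewise_linear(knots_x, knots_y, x):
    """Evaluate a piecewise-linear function defined by sorted knots."""
    if len(knots_x) != len(knots_y) or len(knots_x) < 2:
        raise ValueError("need at least two knots with matching values")
    # Clamp outside the knot range (an assumption, not from the patent).
    if x <= knots_x[0]:
        return knots_y[0]
    if x >= knots_x[-1]:
        return knots_y[-1]
    i = bisect_right(knots_x, x) - 1  # index of the segment containing x
    t = (x - knots_x[i]) / (knots_x[i + 1] - knots_x[i])
    return knots_y[i] + t * (knots_y[i + 1] - knots_y[i])


# Hypothetical knots: the mean of one Gaussian component at 0, 10 and 20 dB SNR.
snr_knots_db = [0.0, 10.0, 20.0]
mean_at_knots = [1.0, 2.0, 2.5]


def mean_for_snr(snr_db):
    """Mean parameter of the component at the given instantaneous SNR."""
    return piecewise_linear(snr_knots_db, mean_at_knots, snr_db)
```

Under these made-up knots, `mean_for_snr(5.0)` falls halfway between the first two knot values. Swapping in a cubic spline would change only the interpolation inside each segment, not the overall structure.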
