Quick Patent Search: fast search of global patents, free commercial-use patent database (IPRDB)

1. Granted invention patent

US08315799B2 Location based full address entry via speech recognition (in force)
Publication No.: US08315799B2
Publication date: 2012-11-20
Application No.: US12777924
Filing date: 2010-05-11
Applicant(s): Neal J. Alewine, John W. Eckhart, Peder A. Olsen, Kenneth D. White
Inventor(s): Neal J. Alewine, John W. Eckhart, Peder A. Olsen, Kenneth D. White
IPC classification: G01C21/32
CPC classification: G01C21/3608
Abstract: A computer implemented method, system and/or computer program product confirm an orally entered address to a mobile navigation device. The mobile navigation device receives a global positioning system (GPS) root address component from a GPS. The GPS root address component is a text name of a root location at which a mobile navigation device is currently located. The mobile navigation device receives an orally entered address that comprises an oral root address component and an oral subunit component of the oral root address component. In response to the converted root address component matching the GPS root address component, the orally entered address is partitioned into the oral subunit component and the oral root address component, and any additional speech-to-text conversion of the orally entered address after the oral root address component is terminated.

2. Granted invention patent

US08229744B2 Class detection scheme and time mediated averaging of class dependent models (in force)
Publication No.: US08229744B2
Publication date: 2012-07-24
Application No.: US10649909
Filing date: 2003-08-26
Applicant(s): Satyanarayana Dharanipragada, Peder A. Olsen
Inventor(s): Satyanarayana Dharanipragada, Peder A. Olsen
IPC classification: G10L15/00
CPC classification: G10L15/07, G10L15/02, G10L15/14, G10L2015/025
Abstract: A method, system, and computer program for class detection and time mediated averaging of class dependent models. A technique is described to take advantage of gender information in training data and how to obtain female, male, and gender independent models from this information. By using a probability value to average male and female Gaussian Mixture Models (GMMs), dramatic deterioration in cross gender decoding performance is avoided.
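The averaging idea in this abstract can be illustrated with a minimal sketch. It assumes one-dimensional GMMs and a class probability `p_male` that is simply given; how the patent estimates or updates that probability over time is not stated in the abstract, and the toy model parameters below are hypothetical.

```python
import math

def gmm_pdf(x, weights, means, variances):
    """Density of a 1-D Gaussian mixture evaluated at x."""
    total = 0.0
    for w, m, v in zip(weights, means, variances):
        total += w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
    return total

def averaged_pdf(x, p_male, male_gmm, female_gmm):
    """Blend two class-dependent GMMs with a class probability p_male in [0, 1]."""
    return p_male * gmm_pdf(x, *male_gmm) + (1.0 - p_male) * gmm_pdf(x, *female_gmm)

# Hypothetical toy models, each given as (weights, means, variances).
male_gmm = ([0.6, 0.4], [-1.0, 0.5], [1.0, 0.5])
female_gmm = ([0.5, 0.5], [1.0, 2.5], [0.8, 1.2])

for p in (0.0, 0.5, 1.0):
    print(p, averaged_pdf(0.3, p, male_gmm, female_gmm))
```

With `p_male` at 0 or 1 the blend reduces to a single class-dependent model; intermediate values interpolate between the two, which is one way to avoid committing to the wrong model when the class is uncertain.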

3. Granted invention patent

US06374216B1 Penalized maximum likelihood estimation methods, the baum welch algorithm and diagonal balancing of symmetric matrices for the training of acoustic models in speech recognition (lapsed)
Publication No.: US06374216B1
Publication date: 2002-04-16
Application No.: US09404995
Filing date: 1999-09-27
Applicant(s): Charles A. Micchelli, Peder A. Olsen
Inventor(s): Charles A. Micchelli, Peder A. Olsen
IPC classification: G10L15/08
CPC classification: G10L15/063
Abstract: A nonparametric family of density functions formed by histogram estimators for modeling acoustic vectors is used in automatic recognition of speech. A Gaussian kernel is set forth in the density estimator. When the densities are found for all the basic sounds in a training stage, an acoustic vector is assigned to a phoneme label corresponding to the highest likelihood for the basis of the decoding of acoustic vectors into text.

4. Invention patent application

US20140129226A1 PRIVACY-SENSITIVE SPEECH MODEL CREATION VIA AGGREGATION OF MULTIPLE USER MODELS (in force)
Publication No.: US20140129226A1
Publication date: 2014-05-08
Application No.: US13668662
Filing date: 2012-11-05
Applicant(s): Antonio R. Lee, Petr Novak, Peder A. Olsen, Vaibhava Goel
Inventor(s): Antonio R. Lee, Petr Novak, Peder A. Olsen, Vaibhava Goel
IPC classification: G10L15/04
CPC classification: G10L15/065, G06F21/6245, G06F21/78, G10L15/04, H04L63/0407
Abstract: Techniques disclosed herein include systems and methods for privacy-sensitive training data collection for updating acoustic models of speech recognition systems. In one embodiment, the system locally creates adaptation data from raw audio data. Such adaptation can include derived statistics and/or acoustic model update parameters. The derived statistics and/or updated acoustic model data can then be sent to a speech recognition server or third-party entity. Since the audio data and transcriptions are already processed, the statistics or acoustic model data is devoid of any information that could be human-readable or machine readable such as to enable reconstruction of audio data. Thus, such converted data sent to a server does not include personal or confidential information. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.

5. Granted invention patent

US08738376B1 Sparse maximum a posteriori (MAP) adaptation (in force)
Publication No.: US08738376B1
Publication date: 2014-05-27
Application No.: US13284373
Filing date: 2011-10-28
Applicant(s): Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
Inventor(s): Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
IPC classification: G10L21/02, G10L19/14, G10L19/00, G10L15/20, G10L15/06, G10L17/00, G10L13/00, G10L21/00
CPC classification: G10L15/14, G10L13/00, G10L15/06, G10L15/07, G10L15/20, G10L17/00, G10L19/00, G10L21/00, G10L21/02
Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.
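The storage saving described in this abstract comes from keeping only the changed parameters relative to a shared baseline. The sketch below illustrates that representation only. The magnitude threshold is an assumed stand-in for the sparsity-constrained MAP estimation, whose actual form the abstract does not give, and the function names are hypothetical.

```python
def sparse_delta(baseline, adapted, threshold):
    """Keep only parameter changes larger than threshold.

    The threshold is an assumed stand-in for the sparseness constraint;
    the result maps parameter index -> change from the baseline.
    """
    return {
        i: a - b
        for i, (b, a) in enumerate(zip(baseline, adapted))
        if abs(a - b) > threshold
    }

def apply_delta(baseline, delta):
    """Rebuild a user-specific parameter vector from the baseline and its sparse delta."""
    return [b + delta.get(i, 0.0) for i, b in enumerate(baseline)]

baseline = [1.0, 2.0, 3.0, 4.0]
adapted = [1.01, 2.5, 3.0, 3.0]      # hypothetical fully adapted parameters
delta = sparse_delta(baseline, adapted, threshold=0.1)
print(delta)                          # only two of four parameters are stored
print(apply_delta(baseline, delta))
```

Per user, only `delta` needs to be stored; the baseline model is shared across all users.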

6. Granted invention patent

US08635067B2 Model restructuring for client and server based automatic speech recognition (lapsed)
Publication No.: US08635067B2
Publication date: 2014-01-21
Application No.: US12964433
Filing date: 2010-12-09
Applicant(s): Pierre Dognin, Vaibhava Goel, John R. Hershey, Peder A. Olsen
Inventor(s): Pierre Dognin, Vaibhava Goel, John R. Hershey, Peder A. Olsen
IPC classification: G10L15/14
CPC classification: G10L15/144, G10L15/30, G10L2015/0636, G10L2015/085
Abstract: Access is obtained to a large reference acoustic model for automatic speech recognition. The large reference acoustic model has L states modeled by L mixture models, and the large reference acoustic model has N components. A desired number of components Nc, less than N, to be used in a restructured acoustic model derived from the reference acoustic model, is identified. The desired number of components Nc is selected based on a computing environment in which the restructured acoustic model is to be deployed. The restructured acoustic model also has L states. For each given one of the L mixture models in the reference acoustic model, a merge sequence is built which records, for a given cost function, sequential mergers of pairs of the components associated with the given one of the mixture models. A portion of the Nc components is assigned to each of the L states in the restructured acoustic model. The restructured acoustic model is built by, for each given one of the L states in the restructured acoustic model, applying the merge sequence to a corresponding one of the L mixture models in the reference acoustic model until the portion of the Nc components assigned to the given one of the L states is achieved.
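The merge-sequence idea can be sketched for the mixture model of a single state. The example assumes one-dimensional Gaussian components, a moment-matching merge, and a cost equal to the increase in weighted variance. These are illustrative choices, not the cost function or merge rule of the patent, and all function names are hypothetical.

```python
def merge_pair(a, b):
    """Moment-matched merge of two 1-D Gaussians given as (weight, mean, variance)."""
    w = a[0] + b[0]
    m = (a[0] * a[1] + b[0] * b[1]) / w
    v = (a[0] * (a[2] + (a[1] - m) ** 2) + b[0] * (b[2] + (b[1] - m) ** 2)) / w
    return (w, m, v)

def cost(a, b):
    """Assumed cost: increase in weighted variance caused by merging a and b."""
    merged = merge_pair(a, b)
    return merged[0] * merged[2] - a[0] * a[2] - b[0] * b[2]

def build_merge_sequence(components):
    """Greedily merge the cheapest pair until one component is left.

    Returns snapshots, where snapshots[k] is the mixture after k merges.
    """
    current = list(components)
    snapshots = [list(current)]
    while len(current) > 1:
        pairs = [(i, j) for i in range(len(current)) for j in range(i + 1, len(current))]
        i, j = min(pairs, key=lambda p: cost(current[p[0]], current[p[1]]))
        merged = merge_pair(current[i], current[j])
        current = [c for k, c in enumerate(current) if k not in (i, j)] + [merged]
        snapshots.append(list(current))
    return snapshots

def restructure(snapshots, target_count):
    """Replay the recorded sequence until target_count components remain."""
    return snapshots[len(snapshots[0]) - target_count]

state_mixture = [(0.25, 0.0, 1.0), (0.25, 0.1, 1.0), (0.5, 5.0, 1.0)]
snapshots = build_merge_sequence(state_mixture)
print(restructure(snapshots, 2))
```

Because the sequence is built once per state, models of different sizes can be produced later by replaying it to different depths, without recomputing any costs.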

7. Granted invention patent

US06804648B1 Impulsivity estimates of mixtures of the power exponential distrubutions in speech modeling (lapsed)
Publication No.: US06804648B1
Publication date: 2004-10-12
Application No.: US09275782
Filing date: 1999-03-25
Applicant(s): Sankar Basu, Charles A. Micchelli, Peder A. Olsen
Inventor(s): Sankar Basu, Charles A. Micchelli, Peder A. Olsen
IPC classification: G10L15/28
CPC classification: G10L15/144
Abstract: A parametric family of multivariate density functions formed by mixture models from univariate functions of the type exp(−|x|^β) for modeling acoustic feature vectors is used in automatic recognition of speech. The parameter β is used to measure the non-Gaussian nature of the data. β is estimated from the input data using a maximum likelihood criterion. There is a balance between β and the number of data points that must be satisfied for efficient estimation.
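To make the role of β concrete, here is a minimal sketch of a normalized univariate density proportional to exp(−|x/s|^β), with a grid search over β by maximum likelihood. The scale parameter `scale`, the grid search, and the function names are assumptions for illustration; the abstract does not specify the patent's parameterization or estimation procedure.

```python
import math

def power_exponential_pdf(x, beta, scale=1.0):
    """Normalized density proportional to exp(-|x/scale|**beta).

    beta = 2 gives a Gaussian shape, beta = 1 a Laplacian shape.
    """
    norm = beta / (2.0 * scale * math.gamma(1.0 / beta))
    return norm * math.exp(-abs(x / scale) ** beta)

def log_likelihood(data, beta, scale=1.0):
    """Sum of log densities over the data for a given beta."""
    return sum(math.log(power_exponential_pdf(x, beta, scale)) for x in data)

def estimate_beta(data, candidates, scale=1.0):
    """Pick the candidate beta with the highest log-likelihood (simple grid search)."""
    return max(candidates, key=lambda b: log_likelihood(data, b, scale))

data = [0.1, -0.2, 0.05, 1.9, -2.3, 0.0, 0.4]   # hypothetical samples
print(estimate_beta(data, [0.5, 1.0, 1.5, 2.0, 3.0]))
```

Smaller values of β give heavier tails and a sharper peak, which is why β can serve as a measure of how impulsive, or non-Gaussian, the data are.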

8. Invention patent application

US20140257809A1 SPARSE MAXIMUM A POSTERIORI (MAP) ADAPTION (in force)
Publication No.: US20140257809A1
Publication date: 2014-09-11
Application No.: US14284738
Filing date: 2014-05-22
Applicant(s): Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
Inventor(s): Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
IPC classification: G10L15/14
CPC classification: G10L15/14, G10L13/00, G10L15/06, G10L15/07, G10L15/20, G10L17/00, G10L19/00, G10L21/00, G10L21/02
Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.

9. Invention patent application

US20120150536A1 MODEL RESTRUCTURING FOR CLIENT AND SERVER BASED AUTOMATIC SPEECH RECOGNITION (lapsed)
Publication No.: US20120150536A1
Publication date: 2012-06-14
Application No.: US12964433
Filing date: 2010-12-09
Applicant(s): Pierre Dognin, Vaibhava Goel, John R. Hershey, Peder A. Olsen
Inventor(s): Pierre Dognin, Vaibhava Goel, John R. Hershey, Peder A. Olsen
IPC classification: G10L15/00
CPC classification: G10L15/144, G10L15/30, G10L2015/0636, G10L2015/085
Abstract: Access is obtained to a large reference acoustic model for automatic speech recognition. The large reference acoustic model has L states modeled by L mixture models, and the large reference acoustic model has N components. A desired number of components Nc, less than N, to be used in a restructured acoustic model derived from the reference acoustic model, is identified. The desired number of components Nc is selected based on a computing environment in which the restructured acoustic model is to be deployed. The restructured acoustic model also has L states. For each given one of the L mixture models in the reference acoustic model, a merge sequence is built which records, for a given cost function, sequential mergers of pairs of the components associated with the given one of the mixture models. A portion of the Nc components is assigned to each of the L states in the restructured acoustic model. The restructured acoustic model is built by, for each given one of the L states in the restructured acoustic model, applying the merge sequence to a corresponding one of the L mixture models in the reference acoustic model until the portion of the Nc components assigned to the given one of the L states is achieved.
