Quick Patent Search - Fast search of global patents, free commercial patent database - IPRDB

81. Granted patent

US07478038B2 Language model adaptation using semantic supervision (Status: Lapsed)
Publication No.: US07478038B2
Publication date: 2009-01-13
Application No.: US10814906
Filing date: 2004-03-31
Applicant: Ciprian Chelba, Milind Mahajan, Alejandro Acero, Yik-Cheung Tam
Inventor: Ciprian Chelba, Milind Mahajan, Alejandro Acero, Yik-Cheung Tam
IPC classification: G06F17/21
CPC classification: G06F17/27, G10L15/1815
Abstract: A method and apparatus are provided for adapting a language model. The method and apparatus provide supervised class-based adaptation of the language model utilizing in-domain semantic information.

82. Patent application

US20080281806A1 SEARCHING A DATABASE OF LISTINGS (Status: In force)
Publication No.: US20080281806A1
Publication date: 2008-11-13
Application No.: US11746847
Filing date: 2007-05-10
Applicant: Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alejandro Acero, Geoffrey G. Zweig
Inventor: Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alejandro Acero, Geoffrey G. Zweig
IPC classification: G06F17/30
CPC classification: G06F17/30663, G06F3/0641, G06F17/3069, G10L15/187, G10L15/197
Abstract: A database having listings rather than long documents is searched using a term frequency-inverse document frequency (Tf/Idf) algorithm.
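
The abstract names only the general technique, so the following is a minimal Python sketch of generic Tf/Idf ranking over short listings, not the method claimed in the application. The tokenization (lowercased whitespace split), the smoothed IDF formula, the cosine scoring, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def build_index(listings):
    """Count terms per listing and compute a smoothed IDF for each term."""
    docs = [Counter(text.lower().split()) for text in listings]
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(doc.keys())  # each listing contributes at most 1 per term
    # Smoothed IDF (an assumed variant): stays positive even for terms
    # that occur in every listing.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return docs, idf


def tfidf_vector(counts, idf):
    """Weight raw term counts by IDF; terms unknown to the index are dropped."""
    return {t: tf * idf[t] for t, tf in counts.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query, listings):
    """Return (score, listing) pairs sorted from best to worst match."""
    docs, idf = build_index(listings)
    q = tfidf_vector(Counter(query.lower().split()), idf)
    scored = [(cosine(q, tfidf_vector(d, idf)), text)
              for d, text in zip(docs, listings)]
    return sorted(scored, key=lambda s: s[0], reverse=True)
```

Because listings are short, each one usually contains a term at most once, so the IDF weight tends to dominate the ranking; this sketch does nothing special to exploit that.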

83. Granted patent

US07447630B2 Method and apparatus for multi-sensory speech enhancement (Status: In force)
Publication No.: US07447630B2
Publication date: 2008-11-04
Application No.: US10724008
Filing date: 2003-11-26
Applicant: Zicheng Liu, Michael J. Sinclair, Alejandro Acero, Xuedong D. Huang, James G. Droppo, Li Deng, Zhengyou Zhang, Yanli Zheng
Inventor: Zicheng Liu, Michael J. Sinclair, Alejandro Acero, Xuedong D. Huang, James G. Droppo, Li Deng, Zhengyou Zhang, Yanli Zheng
IPC classification: G10L21/02
CPC classification: G10L21/0208, G10L2021/02165
Abstract: A method and system use an alternative sensor signal received from a sensor other than an air conduction microphone to estimate a clean speech value. The estimation uses either the alternative sensor signal alone, or in conjunction with the air conduction microphone signal. The clean speech value is estimated without using a model trained from noisy training data collected from an air conduction microphone. Under one embodiment, correction vectors are added to a vector formed from the alternative sensor signal in order to form a filter, which is applied to the air conduction microphone signal to produce the clean speech estimate. In other embodiments, the pitch of a speech signal is determined from the alternative sensor signal and is used to decompose an air conduction microphone signal. The decomposed signal is then used to determine a clean signal estimate.

84. Granted patent

US07409346B2 Two-stage implementation for phonetic recognition using a bi-directional target-filtering model of speech coarticulation and reduction (Status: In force)
Publication No.: US07409346B2
Publication date: 2008-08-05
Application No.: US11069474
Filing date: 2005-03-01
Applicant: Alejandro Acero, Dong Yu, Li Deng
Inventor: Alejandro Acero, Dong Yu, Li Deng
IPC classification: G10L15/10
CPC classification: G10L15/02, G10L25/15, G10L25/24, G10L2015/025
Abstract: A structured generative model of speech coarticulation and reduction is described with a novel two-stage implementation. At the first stage, the dynamics of formants or vocal tract resonance (VTR) are generated using prior information of resonance targets in the phone sequence. Bi-directional temporal filtering with finite impulse response (FIR) is applied to the segmental target sequence as the FIR filter's input. At the second stage, the dynamics of speech cepstra are predicted analytically based on the FIR filtered VTR targets. The combined system of these two stages thus generates correlated and causally related VTR and cepstral dynamics where phonetic reduction is represented explicitly in the hidden resonance space and implicitly in the observed cepstral space. The combined system also gives the acoustic observation probability given a phone sequence. Using this probability, different phone sequences can be compared and ranked in terms of their respective probability values. This then permits the use of the model for phonetic recognition.
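
The first stage described in the abstract can be illustrated with a small Python sketch: a piecewise-constant sequence of resonance targets is smoothed by a non-causal (two-sided) FIR filter. The exponential kernel shape, the parameter names gamma and half_width, the edge handling, and the example target values are all assumptions for illustration; the abstract does not specify them, and the second (cepstral prediction) stage is not covered here.

```python
def expand_segments(segments):
    """Turn (target_value, n_frames) pairs into a per-frame target sequence."""
    return [value for value, n_frames in segments for _ in range(n_frames)]


def symmetric_fir_kernel(gamma, half_width):
    """Two-sided kernel gamma**|k| for k in [-half_width, half_width],
    normalized to unit sum (an assumed, illustrative kernel shape)."""
    raw = [gamma ** abs(k) for k in range(-half_width, half_width + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def filter_targets(targets, kernel):
    """Apply the bi-directional FIR kernel to the target sequence.

    Each output frame mixes past and future targets; frames beyond either
    end of the sequence reuse the nearest edge target.
    """
    half = len(kernel) // 2
    last = len(targets) - 1
    out = []
    for t in range(len(targets)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(t + j - half, 0), last)  # clamp to valid frames
            acc += w * targets[idx]
        out.append(acc)
    return out
```

With two segments whose hypothetical targets are 500 Hz and 1500 Hz, the filtered trajectory moves gradually from one to the other around the boundary. A short segment between two others never fully reaches its own target, which is one way to picture the reduction effect the abstract mentions.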

85. Patent application

US20080177547A1 Integrated speech recognition and semantic classification (Status: In force)
Publication No.: US20080177547A1
Publication date: 2008-07-24
Application No.: US11655703
Filing date: 2007-01-19
Applicant: Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alejandro Acero
Inventor: Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alejandro Acero
IPC classification: G10L15/18
CPC classification: G10L15/1815
Abstract: A novel system integrates speech recognition and semantic classification, so that acoustic scores in a speech recognizer that accepts spoken utterances may be taken into account when training both language models and semantic classification models. For example, a joint association score may be defined that is indicative of a correspondence of a semantic class and a word sequence for an acoustic signal. The joint association score may incorporate parameters such as weighting parameters for signal-to-class modeling of the acoustic signal, language model parameters and scores, and acoustic model parameters and scores. The parameters may be revised to raise the joint association score of a target word sequence with a target semantic class relative to the joint association score of a competitor word sequence with the target semantic class. The parameters may be designed so that the semantic classification errors in the training data are minimized.

86. Granted patent

US07328147B2 Automatic resolution of segmentation ambiguities in grammar authoring (Status: In force)
Publication No.: US07328147B2
Publication date: 2008-02-05
Application No.: US10406524
Filing date: 2003-04-03
Applicant: YeYi Wang, Alejandro Acero
Inventor: YeYi Wang, Alejandro Acero
IPC classification: G06F17/27
CPC classification: G06F17/271, G10L15/18
Abstract: A rules-based grammar is generated. Segmentation ambiguities are identified in training data. Rewrite rules for the ambiguous segmentations are enumerated and probabilities are generated for each. Ambiguities are resolved based on the probabilities. In one embodiment, this is done by applying the expectation maximization (EM) algorithm.
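
The steps in the abstract (enumerate candidate rewrite rules, estimate a probability for each, pick the most probable segmentation) can be sketched in Python as a generic EM loop. The data layout, the per-left-hand-side normalization, the uniform initialization, and the function names are assumptions for illustration; the abstract does not give the patent's exact formulation.

```python
from collections import defaultdict


def segmentation_score(segmentation, prob):
    """Probability of one segmentation: product of its rule probabilities."""
    score = 1.0
    for rule in segmentation:
        score *= prob[rule]
    return score


def em_rule_probs(examples, iterations=20):
    """Estimate rewrite-rule probabilities with EM.

    examples: list of training examples; each example is a list of candidate
    segmentations; each segmentation is a list of (lhs, rhs) rewrite rules.
    """
    rules = {rule for ex in examples for seg in ex for rule in seg}
    by_lhs = defaultdict(list)
    for rule in rules:
        by_lhs[rule[0]].append(rule)
    # Start uniform among rules that share a left-hand side.
    prob = {rule: 1.0 / len(by_lhs[rule[0]]) for rule in rules}

    for _ in range(iterations):
        counts = defaultdict(float)
        for ex in examples:
            # E-step: posterior of each candidate segmentation.
            scores = [segmentation_score(seg, prob) for seg in ex]
            total = sum(scores)
            if total == 0.0:
                continue
            for seg, score in zip(ex, scores):
                for rule in seg:
                    counts[rule] += score / total
        # M-step: renormalize expected counts per left-hand side.
        for lhs, lhs_rules in by_lhs.items():
            z = sum(counts[rule] for rule in lhs_rules)
            if z > 0.0:
                for rule in lhs_rules:
                    prob[rule] = counts[rule] / z
    return prob


def best_segmentation(example, prob):
    """Resolve an ambiguity by picking the most probable candidate."""
    return max(example, key=lambda seg: segmentation_score(seg, prob))
```

Unambiguous examples pull probability mass toward the rules they use, which in turn tips the ambiguous examples toward segmentations built from those same rules.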

87. Granted patent

US07254536B2 Method of noise reduction using correction and scaling vectors with partitioning of the acoustic space in the domain of noisy speech (Status: In force)
Publication No.: US07254536B2
Publication date: 2007-08-07
Application No.: US11059036
Filing date: 2005-02-16
Applicant: Li Deng, Xuedong Huang, Alejandro Acero
Inventor: Li Deng, Xuedong Huang, Alejandro Acero
IPC classification: G10L21/02
CPC classification: G10L21/0208
Abstract: A method and apparatus are provided for reducing noise in a training signal and/or test signal. The noise reduction technique uses a stereo signal formed of two channel signals, each channel containing the same pattern signal. One of the channel signals is “clean” and the other includes additive noise. Using feature vectors from these channel signals, a collection of noise correction and scaling vectors is determined. When a feature vector of a noisy pattern signal is later received, it is multiplied by the best scaling vector for that feature vector and the best correction vector is added to the product to produce a noise-reduced feature vector. Under one embodiment, the best scaling and correction vectors are identified by choosing an optimal mixture component for the noisy feature vector. The optimal mixture component is selected based on a distribution of noisy channel feature vectors associated with each mixture component.
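
The run-time step described in the abstract (select a mixture component for the noisy vector, then scale and correct it) can be sketched in Python as follows. The diagonal-Gaussian component model, the element-wise application of the scaling vector, and every name in the code are assumptions for illustration; how the scaling and correction vectors are trained from stereo data is not shown.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    weight: float            # prior probability of the mixture component
    mean: List[float]        # mean of noisy-channel feature vectors
    var: List[float]         # diagonal variance of noisy-channel vectors
    scale: List[float]       # scaling vector tied to this component
    correction: List[float]  # correction vector tied to this component


def log_score(y, comp):
    """Log of prior times diagonal-Gaussian likelihood of noisy vector y."""
    total = math.log(comp.weight)
    for yi, m, v in zip(y, comp.mean, comp.var):
        total += -0.5 * (math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v)
    return total


def denoise(y, components):
    """Pick the best-scoring component for y, then return scale * y + correction
    (element-wise) as the noise-reduced feature vector."""
    best = max(components, key=lambda comp: log_score(y, comp))
    return [s * yi + r for s, yi, r in zip(best.scale, y, best.correction)]
```

Selecting the component from the noisy vector itself means no clean reference is needed at run time, which matches the partitioning "in the domain of noisy speech" in the patent title.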

88. Patent application

US20070150268A1 Spatial noise suppression for a microphone array (Status: In force)
Publication No.: US20070150268A1
Publication date: 2007-06-28
Application No.: US11316002
Filing date: 2005-12-22
Applicant: Alejandro Acero, Ivan Tashev, Michael Seltzer
Inventor: Alejandro Acero, Ivan Tashev, Michael Seltzer
IPC classification: G10L21/02
CPC classification: G10L21/0208, G10L2021/02166
Abstract: A microphone array having at least three microphones provides a captured signal. Spatial noise suppression estimates a desired signal from a captured signal using spatio-temporal distribution of the speech and the noise. In particular, spatial information indicative of at least two quantities of direction is used. A first quantity is based on a first combination of the signals from the at least three microphones, and a second quantity is based on a second combination of the signals of the at least three microphones.

89. Patent application

US20070106509A1 Indexing and searching speech with text meta-data (Status: In force)
Publication No.: US20070106509A1
Publication date: 2007-05-10
Application No.: US11269872
Filing date: 2005-11-08
Applicant: Alejandro Acero, Ciprian Chelba, Jorge Sanchez
Inventor: Alejandro Acero, Ciprian Chelba, Jorge Sanchez
IPC classification: G10L15/00
CPC classification: G06F17/30778, G06F17/30746, G06F17/30749, G10L15/197
Abstract: An index for searching spoken documents having speech data and text meta-data is created by obtaining probabilities of occurrence of words and positional information of the words of the speech data and combining them with at least positional information of the words in the text meta-data. A single index can be created because the speech data and the text meta-data are treated the same and considered only as different categories.
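
One way to picture the single index described in the abstract is an inverted index whose postings record a category alongside position and probability. The Python sketch below assumes a simple input layout, uses hypothetical field names ("speech", "meta"), and assigns probability 1.0 to meta-data words because their text is known exactly; none of these details come from the application.

```python
from collections import defaultdict


def build_spoken_index(documents):
    """Build one inverted index over speech hypotheses and text meta-data.

    documents: {doc_id: {"speech": [(word, position, probability), ...],
                         "meta": {category_name: text, ...}}}
    Each posting is (doc_id, category, position, probability).
    """
    index = defaultdict(list)
    for doc_id, doc in documents.items():
        # Recognized words carry the recognizer's probability of occurrence.
        for word, position, probability in doc["speech"]:
            index[word.lower()].append((doc_id, "speech", position, probability))
        # Meta-data words are certain, so they are stored with probability 1.0.
        for category, text in doc["meta"].items():
            for position, word in enumerate(text.lower().split()):
                index[word].append((doc_id, category, position, 1.0))
    return index


def lookup(index, word):
    """Return all postings for a word, across every category."""
    return index.get(word.lower(), [])
```

Since speech and meta-data postings share one structure, a query can be scored against all categories in a single pass and weighted per category afterwards.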

90. Patent application

US20070055492A1 Configurable grammar templates (Status: Under examination, published)
Publication No.: US20070055492A1
Publication date: 2007-03-08
Application No.: US11259475
Filing date: 2005-10-26
Applicant: Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alejandro Acero
Inventor: Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alejandro Acero
IPC classification: G06F17/27
CPC classification: G06F17/27, G06F17/2247, G06F17/248, G10L15/193
Abstract: To provide application developers with the ability to easily form customized grammars, grammar extensions are provided that allow application developers to selectively include portions of grammar templates and to easily combine grammar elements to form new grammar structures.
