Quick Patent Search: fast search of global patents, free commercial-use patent database (IPRDB)

1. Granted invention patent

US07089182B2 Method and apparatus for feature domain joint channel and additive noise compensation (status: in force)
Publication No.: US07089182B2
Publication date: 2006-08-08
Application No.: US10099305
Filing date: 2002-03-15
Applicants: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua
Inventors: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua
IPC classification: G10L15/20
CPC classification: G10L15/20, G10L15/02, G10L15/063, G10L21/0216
Abstract: A method for performing noise adaptation of a target speech signal input to a speech recognition system, where the target speech signal contains both additive and convolutional noise. The method includes estimating an additive noise bias and a convolutional noise bias in the target speech signal, and jointly compensating the target speech signal for the additive and convolutional noise biases in a feature domain.
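
The joint compensation described in this abstract can be illustrated with a minimal sketch. It assumes the textbook log-spectral corruption model y = log(exp(x + h) + exp(n)), where x is clean speech, h the convolutional (channel) bias and n the additive noise. This model, the function names, and the `floor` parameter are assumptions for illustration and are not taken from the patent, which may estimate and apply the biases differently.

```python
import math

def corrupt(x, h, n):
    """Simulate a noisy log-spectrum: channel bias h is added in the log
    domain, additive noise n is added in the linear (energy) domain."""
    return [math.log(math.exp(xi + hi) + math.exp(ni))
            for xi, hi, ni in zip(x, h, n)]

def compensate(y, h_hat, n_hat, floor=1e-3):
    """Jointly remove estimated additive and convolutional biases.

    `floor` (hypothetical) keeps the logarithm defined when the noise
    estimate is close to, or above, the observed energy."""
    cleaned = []
    for yi, hi, ni in zip(y, h_hat, n_hat):
        # Fraction of the observed energy not explained by additive noise.
        speech_fraction = max(1.0 - math.exp(ni - yi), floor)
        cleaned.append(yi + math.log(speech_fraction) - hi)
    return cleaned

clean = [1.0, 2.0, 0.5]
channel = [-0.5, 0.2, 0.0]
noise = [-2.0, -1.0, -3.0]
restored = compensate(corrupt(clean, channel, noise), channel, noise)
```

With exact bias estimates, `restored` matches `clean` up to floating-point error. In practice both biases would have to be estimated from the signal, which is the part the abstract leaves unspecified.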

2. Invention patent application

US20050038655A1 Bubble splitting for compact acoustic modeling (status: in force)
Publication No.: US20050038655A1
Publication date: 2005-02-17
Application No.: US10639974
Filing date: 2003-08-13
Applicants: Ambroise Mutel, Patrick Nguyen, Luca Rigazio
Inventors: Ambroise Mutel, Patrick Nguyen, Luca Rigazio
IPC classification: G10L11/00, G10L15/02, G10L15/06, G10L15/14
CPC classification: G10L15/063, G10L15/144, G10L2015/0631, G10L2015/0638
Abstract: An improved method is provided for constructing compact acoustic models for use in a speech recognizer. The method includes: partitioning speech data from a plurality of training speakers according to at least one speech-related criterion (i.e., vocal tract length); grouping together the partitioned speech data from training speakers having a similar speech characteristic; and training an acoustic bubble model for each group using the speech data within the group.
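
The partition, group, and train steps in this abstract can be sketched as follows. This is only an illustration under stated assumptions: each speaker is described by a hypothetical vocal-tract-length warp factor, speakers are grouped by sorting on that factor and cutting the list into equal chunks, and "training" is reduced to taking the mean of the pooled feature values. None of these names or choices come from the patent.

```python
from statistics import mean

def partition_by_warp(warp_factors, n_groups):
    """Sort speakers by warp factor and cut the list into contiguous groups,
    so that speakers with similar vocal tract length end up together."""
    ordered = sorted(warp_factors, key=warp_factors.get)
    size = -(-len(ordered) // n_groups)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]

def train_bubble_models(groups, features):
    """Stand-in for acoustic training: one pooled mean per speaker group."""
    return [mean(v for speaker in group for v in features[speaker])
            for group in groups]

# Hypothetical toy data: warp factor and scalar features per speaker.
warp = {"a": 0.90, "b": 1.10, "c": 0.92, "d": 1.08}
feats = {"a": [1.0, 2.0], "b": [12.0, 14.0], "c": [3.0], "d": [10.0]}

groups = partition_by_warp(warp, n_groups=2)
models = train_bubble_models(groups, feats)
```

Here the two groups are `["a", "c"]` and `["d", "b"]`, and each group gets its own model. A real system would train a full acoustic model per group rather than a single mean.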

3. Granted invention patent

US07328154B2 Bubble splitting for compact acoustic modeling (status: in force)
Publication No.: US07328154B2
Publication date: 2008-02-05
Application No.: US10639974
Filing date: 2003-08-13
Applicants: Ambroise Mutel, Patrick Nguyen, Luca Rigazio
Inventors: Ambroise Mutel, Patrick Nguyen, Luca Rigazio
IPC classification: G10L15/00
CPC classification: G10L15/063, G10L15/144, G10L2015/0631, G10L2015/0638
Abstract: An improved method is provided for constructing compact acoustic models for use in a speech recognizer. The method includes: partitioning speech data from a plurality of training speakers according to at least one speech-related criterion (i.e., vocal tract length); grouping together the partitioned speech data from training speakers having a similar speech characteristic; and training an acoustic bubble model for each group using the speech data within the group.

4. Granted invention patent

US06901364B2 Focused language models for improved speech input of structured documents (status: in force)
Publication No.: US06901364B2
Publication date: 2005-05-31
Application No.: US09951093
Filing date: 2001-09-13
Applicants: Patrick Nguyen, Luca Rigazio, Jean-Claude Junqua
Inventors: Patrick Nguyen, Luca Rigazio, Jean-Claude Junqua
IPC classification: G10L15/18, G10L15/28, G10L15/26, G06F17/20, G10L21/00
CPC classification: G10L15/1815, G10L15/30
Abstract: An e-mail message process is provided for use with a personal digital assistant. It allows input speech messages to be converted to text using a focused language model. The model is downloaded over a cellular phone connection from an Internet server, which provides the focused language model based upon a topic for the intended e-mail message. The text generated from the input speech can be summarized by the e-mail message processor and edited by the user. The generated e-mail message can then be transmitted, again via the cellular connection, to an Internet e-mail server for delivery to a recipient.

5. Granted invention patent

US06879954B2 Pattern matching for large vocabulary speech recognition systems (status: in force)
Publication No.: US06879954B2
Publication date: 2005-04-12
Application No.: US10127184
Filing date: 2002-04-22
Applicants: Patrick Nguyen, Luca Rigazio
Inventors: Patrick Nguyen, Luca Rigazio
IPC classification: G10L15/08, G10L15/10, G10L15/28, G10L15/00, G06F15/76
CPC classification: G10L15/08, G10L15/10, G10L15/285, G10L15/30, G10L15/34
Abstract: A method is provided for improving pattern matching in a speech recognition system having a plurality of acoustic models. The improved method includes: receiving continuous speech input; generating a sequence of acoustic feature vectors that represent temporal and spectral behavior of the speech input; loading a first group of acoustic feature vectors from the sequence of acoustic feature vectors into a memory workspace accessible to a processor; loading an acoustic model from the plurality of acoustic models into the memory workspace; and determining a similarity measure for each acoustic feature vector of the first group of acoustic feature vectors in relation to the acoustic model. Prior to retrieving another group of acoustic feature vectors, similarity measures are computed for the first group of acoustic feature vectors in relation to each of the acoustic models employed by the speech recognition system. In this way, the improved method reduces the number of I/O operations associated with loading and unloading each acoustic model into memory.
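
The loop ordering this abstract describes (score one group of feature vectors against every acoustic model before fetching the next group) can be sketched as below. It is an illustration only: models are reduced to hypothetical mean vectors held in a dictionary, a dictionary lookup stands in for loading a model into the memory workspace, and the similarity measure is assumed to be negative squared Euclidean distance.

```python
def neg_sq_distance(vec, model_mean):
    """Assumed similarity measure: larger (closer to 0) means more similar."""
    return -sum((v - m) ** 2 for v, m in zip(vec, model_mean))

def score_in_groups(vectors, model_store, group_size):
    """Score every vector against every model, one vector group at a time.

    Returns the scores and the number of simulated model loads."""
    scores = {}
    loads = 0
    for start in range(0, len(vectors), group_size):
        group = vectors[start:start + group_size]  # group held in workspace
        for name in model_store:
            model_mean = model_store[name]  # stands in for one model load
            loads += 1
            for offset, vec in enumerate(group):
                scores[(start + offset, name)] = neg_sq_distance(vec, model_mean)
    return scores, loads

frames = [[0, 0], [1, 1], [2, 2], [3, 3]]
store = {"m0": [0, 0], "m1": [3, 3]}

scores, grouped_loads = score_in_groups(frames, store, group_size=2)
_, per_frame_loads = score_in_groups(frames, store, group_size=1)
```

In this toy setup each model is loaded once per group of vectors instead of once per vector, so `grouped_loads` is half of `per_frame_loads` while the scores are identical.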

6. Granted invention patent

US07035802B1 Recognition system using lexical trees (status: lapsed)
Publication No.: US07035802B1
Publication date: 2006-04-25
Application No.: US09628828
Filing date: 2000-07-31
Applicants: Luca Rigazio, Patrick Nguyen
Inventors: Luca Rigazio, Patrick Nguyen
IPC classification: G10L15/14
CPC classification: G06F17/30625, G10L15/08
Abstract: The dynamic programming technique employs a lexical tree that is encoded in computer memory as a flat representation in which the nodes of each generation occupy contiguous memory locations. The traversal algorithm employs a set of traversal rules whereby nodes of a given generation are processed before the parent nodes of that generation. The deepest child generation is processed first and traversal among nodes of each generation proceeds in the same topological direction.
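
A flat, generation-contiguous tree layout of the kind this abstract describes can be sketched as follows. The sketch makes several assumptions that are not in the patent: words are spelled with letters rather than phonetic units, each flat node is a `(symbol, parent_index, depth)` tuple, and the layout is produced by a breadth-first walk, which naturally places each generation in one contiguous run.

```python
from collections import deque

def flatten_lexical_tree(words):
    """Build a prefix tree and lay it out breadth-first in a flat list."""
    trie = {}
    for word in words:
        node = trie
        for symbol in word:
            node = node.setdefault(symbol, {})
    flat = [("", -1, 0)]  # root node: no symbol, no parent, depth 0
    queue = deque([(trie, 0)])
    while queue:
        children, parent_index = queue.popleft()
        for symbol in sorted(children):
            flat.append((symbol, parent_index, flat[parent_index][2] + 1))
            queue.append((children[symbol], len(flat) - 1))
    return flat

def traversal_order(flat):
    """Visit the deepest generation first, left to right within each one,
    so every node is processed before its parent."""
    max_depth = max(depth for _, _, depth in flat)
    order = []
    for depth in range(max_depth, -1, -1):
        order.extend(i for i, node in enumerate(flat) if node[2] == depth)
    return order

flat = flatten_lexical_tree(["ab", "ac", "b"])
order = traversal_order(flat)
```

For the three toy words, the flat list holds the root, then the first-generation nodes `a` and `b`, then the second-generation nodes `b` and `c`. The traversal visits indices 3 and 4 first, then 1 and 2, then the root.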

7. Invention patent application

US20050075881A1 Voice tagging, voice annotation, and speech recognition for portable devices with optional post processing (status: in force)
Publication No.: US20050075881A1
Publication date: 2005-04-07
Application No.: US10677174
Filing date: 2003-10-02
Applicants: Luca Rigazio, Robert Boman, Patrick Nguyen, Jean-Claude Junqua
Inventors: Luca Rigazio, Robert Boman, Patrick Nguyen, Jean-Claude Junqua
IPC classification: G10L15/26, G10L21/00
CPC classification: G06F17/30796, G10L15/26
Abstract: A media capture device has an audio input receptive of user speech relating to a media capture activity in close temporal relation to the media capture activity. A plurality of focused speech recognition lexica respectively relating to media capture activities are stored on the device, and a speech recognizer recognizes the user speech based on a selected one of the focused speech recognition lexica. A media tagger tags captured media with generated speech recognition text, and a media annotator annotates the captured media with a sample of the user speech that is suitable for input to a speech recognizer. Tagging and annotating are based on close temporal relation between receipt of the user speech and capture of the captured media. Annotations may be converted to tags during post processing, employed to edit a lexicon using letter-to-sound rules and spelled word input, or matched directly to speech to retrieve captured media.
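
Tagging by "close temporal relation" can be pictured with a small sketch. It assumes recognized text and captured media both carry timestamps in seconds, and that an utterance is attached to the nearest capture only if the gap is within a hypothetical `max_gap` window. The function name, the data layout, and the window are illustrative assumptions, not details from the patent.

```python
def tag_media(captures, utterances, max_gap=5.0):
    """Attach each recognized utterance to the capture nearest in time.

    captures:   list of (media_name, captured_at_seconds)
    utterances: list of (recognized_text, spoken_at_seconds)
    Utterances further than max_gap seconds from every capture are dropped."""
    tags = {name: [] for name, _ in captures}
    if not captures:
        return tags
    for text, spoken_at in utterances:
        name, captured_at = min(captures, key=lambda c: abs(c[1] - spoken_at))
        if abs(captured_at - spoken_at) <= max_gap:
            tags[name].append(text)
    return tags

captures = [("img1", 10.0), ("img2", 60.0)]
utterances = [("beach", 12.0), ("dinner", 58.5), ("stray remark", 30.0)]
tags = tag_media(captures, utterances)
```

In this toy run "beach" is attached to `img1` and "dinner" to `img2`, while the utterance at 30 seconds is too far from either capture and is left untagged.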

8. Granted invention patent

US07324943B2 Voice tagging, voice annotation, and speech recognition for portable devices with optional post processing (status: in force)
Publication No.: US07324943B2
Publication date: 2008-01-29
Application No.: US10677174
Filing date: 2003-10-02
Applicants: Luca Rigazio, Robert Boman, Patrick Nguyen, Jean-Claude Junqua
Inventors: Luca Rigazio, Robert Boman, Patrick Nguyen, Jean-Claude Junqua
IPC classification: G10L21/00, H04N5/76
CPC classification: G06F17/30796, G10L15/26
Abstract: A media capture device has an audio input receptive of user speech relating to a media capture activity in close temporal relation to the media capture activity. A plurality of focused speech recognition lexica respectively relating to media capture activities are stored on the device, and a speech recognizer recognizes the user speech based on a selected one of the focused speech recognition lexica. A media tagger tags captured media with generated speech recognition text, and a media annotator annotates the captured media with a sample of the user speech that is suitable for input to a speech recognizer. Tagging and annotating are based on close temporal relation between receipt of the user speech and capture of the captured media. Annotations may be converted to tags during post processing, employed to edit a lexicon using letter-to-sound rules and spelled word input, or matched directly to speech to retrieve captured media.

9. Invention patent application

US20050010411A1 Speech data mining for call center management (status: under examination, published)
Publication No.: US20050010411A1
Publication date: 2005-01-13
Application No.: US10616006
Filing date: 2003-07-09
Applicants: Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua, Robert Boman
Inventors: Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua, Robert Boman
IPC classification: G10L15/26, G10L17/00, G10L15/00
CPC classification: G10L15/26, G10L17/00
Abstract: A speech data mining system for use in generating a rich transcription having utility in call center management includes a speech differentiation module differentiating between speech of interacting speakers, and a speech recognition module improving automatic recognition of speech of one speaker based on interaction with another speaker employed as a reference speaker. A transcript generation module generates a rich transcript based on recognized speech of the speakers. Focused, interactive language models improve recognition of a customer on a low quality channel using context extracted from speech of a call center operator on a high quality channel with a speech model adapted to the operator. Mined speech data includes the number of interaction turns, customer frustration phrases, operator politeness, interruptions, and/or contexts extracted from speech recognition results, such as topics, complaints, solutions, and resolutions. Mined speech data is useful in call center and/or product or service quality management.

10. Granted invention patent

US06687672B2 Methods and apparatus for blind channel estimation based upon speech correlation structure (status: in force)
Publication No.: US06687672B2
Publication date: 2004-02-03
Application No.: US10099428
Filing date: 2002-03-15
Applicants: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua
Inventors: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua
IPC classification: G10L15/08
CPC classification: G10L21/0208
Abstract: Methods and apparatus for blind channel estimation of a speech signal corrupted by a communication channel are provided. One method includes converting a noisy speech signal into either a cepstral representation or a log-spectral representation; estimating a correlation of the representation of the noisy speech signal; determining an average of the noisy speech signal; constructing and solving, subject to a minimization constraint, a system of linear equations utilizing a correlation structure of a clean speech training signal, the correlation of the representation of the noisy speech signal, and the average of the noisy speech signal; and selecting a sign of the solution of the system of linear equations to estimate an average clean speech signal in a processing window.
