Quick Patent Search - Fast search of global patents, free commercial-use patent database - IPRDB

1. Granted patent

US08959014B2 Training acoustic models using distributed computing techniques (In force)
Publication number: US08959014B2
Publication date: 2015-02-17
Application number: US13539225
Filing date: 2012-06-29
Applicant: Peng Xu, Fernando Pereira, Ciprian I. Chelba
Inventor: Peng Xu, Fernando Pereira, Ciprian I. Chelba
IPC classification: G06F17/21, G06F17/20, G10L15/08, G10L15/14, G10L15/187, G10L15/04, G10L15/34
CPC classification: G10L15/187, G10L15/063, G10L15/14, G10L15/34, G10L2015/0631
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training acoustic models. Speech data and data identifying a transcription for the speech data are received. A phonetic representation for the transcription is accessed. Training sequences are identified for a particular phone in the phonetic representation. Each of the training sequences includes a different set of contextual phones surrounding the particular phone. A partitioning key is identified based on a sequence of phones that occurs in each of the training sequences. A processing module to which the identified partitioning key is assigned is selected. Data identifying the training sequences and a portion of the speech data are transmitted to the selected processing module.
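The routing step described in this abstract can be sketched roughly as follows. The sketch builds training sequences with progressively wider context around one phone, derives a partitioning key from phones shared by all of them, and maps that key to a processing module. The function names, the `max_context` parameter, the choice of the central triphone as the key, and the CRC32-based assignment are all assumptions made for illustration; the abstract does not specify them.

```python
import zlib

def training_sequences(phones, index, max_context=3):
    """Return sequences centred on phones[index] with 1..max_context
    phones of context on each side (clipped at utterance boundaries)."""
    sequences = []
    for width in range(1, max_context + 1):
        left = max(0, index - width)
        right = min(len(phones), index + width + 1)
        sequences.append(tuple(phones[left:right]))
    return sequences

def partitioning_key(phones, index):
    """Assumed key: the central triphone, which occurs in every sequence."""
    left = max(0, index - 1)
    return "-".join(phones[left:index + 2])

def select_module(key, num_modules):
    """Map a key to a module id with a stable hash (CRC32, an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_modules

# Illustrative phone string; "sil" marks silence at the utterance edges.
phones = ["sil", "ae", "k", "sh", "ih", "n", "sil"]
idx = 3  # the particular phone "sh"
seqs = training_sequences(phones, idx)
key = partitioning_key(phones, idx)
module = select_module(key, num_modules=4)
```

Because every sequence for the same central triphone produces the same key, all of its training data lands on one module, which is what lets that module estimate models for each context length independently.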

2. Patent application

US20130006623A1 SPEECH RECOGNITION USING VARIABLE-LENGTH CONTEXT (In force)
Publication number: US20130006623A1
Publication date: 2013-01-03
Application number: US13539284
Filing date: 2012-06-29
Applicant: Ciprian I. Chelba, Peng Xu, Fernando Pereira
Inventor: Ciprian I. Chelba, Peng Xu, Fernando Pereira
IPC classification: G10L15/20
CPC classification: G10L15/187, G10L15/063, G10L15/14, G10L15/34, G10L2015/0631
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech using a variable length of context. Speech data and data identifying a candidate transcription for the speech data are received. A phonetic representation for the candidate transcription is accessed. Multiple test sequences are extracted for a particular phone in the phonetic representation. Each of the multiple test sequences includes a different set of contextual phones surrounding the particular phone. Data indicating that an acoustic model includes data corresponding to one or more of the multiple test sequences is received. From among the one or more test sequences, the test sequence that includes the highest number of contextual phones is selected. A score for the candidate transcription is generated based on the data from the acoustic model that corresponds to the selected test sequence.
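The selection rule in this abstract (use the longest context for which the model has data) can be illustrated with a small sketch. The dictionary standing in for the acoustic model, the stored scores, and all function names are hypothetical; a real acoustic model would score speech frames rather than return a stored number.

```python
def extract_test_sequences(phones, index, max_context=3):
    """Sequences centred on phones[index] with 1..max_context phones
    of context on each side (clipped at utterance boundaries)."""
    sequences = []
    for width in range(1, max_context + 1):
        left = max(0, index - width)
        right = min(len(phones), index + width + 1)
        sequences.append(tuple(phones[left:right]))
    return sequences

def select_longest_match(sequences, model):
    """Among sequences the model has data for, pick the one with the
    most phones; return None if the model covers none of them."""
    matches = [seq for seq in sequences if seq in model]
    if not matches:
        return None
    return max(matches, key=len)

# Hypothetical model: context sequence -> stored score (placeholder values).
model = {
    ("k", "sh", "ih"): -4.2,
    ("ae", "k", "sh", "ih", "n"): -3.1,
}

phones = ["sil", "ae", "k", "sh", "ih", "n", "sil"]
candidates = extract_test_sequences(phones, 3)
best = select_longest_match(candidates, model)
score = model[best] if best is not None else None
```

Here the seven-phone sequence is not in the model, so the sketch backs off to the five-phone sequence rather than the triphone.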

3. Granted patent

US08494850B2 Speech recognition using variable-length context (In force)
Publication number: US08494850B2
Publication date: 2013-07-23
Application number: US13539284
Filing date: 2012-06-29
Applicant: Ciprian I. Chelba, Peng Xu, Fernando Pereira
Inventor: Ciprian I. Chelba, Peng Xu, Fernando Pereira
IPC classification: G10L15/20
CPC classification: G10L15/187, G10L15/063, G10L15/14, G10L15/34, G10L2015/0631
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech using a variable length of context. Speech data and data identifying a candidate transcription for the speech data are received. A phonetic representation for the candidate transcription is accessed. Multiple test sequences are extracted for a particular phone in the phonetic representation. Each of the multiple test sequences includes a different set of contextual phones surrounding the particular phone. Data indicating that an acoustic model includes data corresponding to one or more of the multiple test sequences is received. From among the one or more test sequences, the test sequence that includes the highest number of contextual phones is selected. A score for the candidate transcription is generated based on the data from the acoustic model that corresponds to the selected test sequence.

4. Patent application

US20130006612A1 TRAINING ACOUSTIC MODELS (In force)
Publication number: US20130006612A1
Publication date: 2013-01-03
Application number: US13539225
Filing date: 2012-06-29
Applicant: Peng Xu, Fernando Pereira, Ciprian I. Chelba
Inventor: Peng Xu, Fernando Pereira, Ciprian I. Chelba
IPC classification: G06F17/27
CPC classification: G10L15/187, G10L15/063, G10L15/14, G10L15/34, G10L2015/0631
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training acoustic models. Speech data and data identifying a transcription for the speech data are received. A phonetic representation for the transcription is accessed. Training sequences are identified for a particular phone in the phonetic representation. Each of the training sequences includes a different set of contextual phones surrounding the particular phone. A partitioning key is identified based on a sequence of phones that occurs in each of the training sequences. A processing module to which the identified partitioning key is assigned is selected. Data identifying the training sequences and a portion of the speech data are transmitted to the selected processing module.

5. Granted patent

US08515746B1 Selecting speech data for speech recognition vocabulary (In force)
Publication number: US08515746B1
Publication date: 2013-08-20
Application number: US13593910
Filing date: 2012-08-24
Applicant: Maryam Garrett, Ciprian I. Chelba
Inventor: Maryam Garrett, Ciprian I. Chelba
IPC classification: G10L15/00, G10L15/04, G06F17/30
CPC classification: G06F17/30976, G06F17/2735, G10L15/01, G10L15/063
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting training data. In an aspect, a method comprises: selecting a target out of vocabulary rate; selecting a target percentage of user sessions; and determining a minimum training data collection duration for a vocabulary of words, the minimum training data collection duration corresponding to the target percentage of user sessions experiencing the target out of vocabulary rate.

6. Granted patent

US08990692B2 Time-marked hyperlinking to video content (In force)
Publication number: US08990692B2
Publication date: 2015-03-24
Application number: US12412112
Filing date: 2009-03-26
Applicant: Ciprian I. Chelba
Inventor: Ciprian I. Chelba
IPC classification: G06F3/00, H04N5/445, H04N5/765, H04N21/2743, H04N21/472, H04N21/4788, H04N21/845
CPC classification: G06F3/0484, G06F3/0481, G06F3/0482, G06F3/167, G06F17/30424, G10L15/08, H04N5/44591, H04N5/765, H04N21/2743, H04N21/47214, H04N21/47217, H04N21/4788, H04N21/8455
Abstract: In one example, a method includes: receiving from a first user interface a first input from a first user specifying a first particular instant in a video other than a beginning of the video; in response to the first input, generating by one or more computer systems first data for inclusion in a link to the video, the first data representing the first particular instant in the video and being operable automatically to direct playback of the video at a second user interface to start at the first particular instant in the video in response to a second user selecting the link at the second user interface; and communicating the first data to a link generator for inclusion in the link to the video.
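One simple way to picture the "first data for inclusion in a link" is a start-time query parameter appended to the video URL. The sketch below assumes a parameter named `t` holding whole seconds and uses a placeholder host; neither comes from the abstract.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse

def time_marked_link(video_url, start_seconds):
    """Return video_url with an assumed `t` parameter for the start instant."""
    if start_seconds <= 0:
        # The abstract concerns an instant other than the beginning of the video.
        raise ValueError("start_seconds must be after the beginning of the video")
    parts = urlparse(video_url)
    query = parse_qs(parts.query)
    query["t"] = [str(int(start_seconds))]
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))

def start_time_from_link(link):
    """What a player at the second user interface might do: read `t`,
    defaulting to 0 (the beginning) when the parameter is absent."""
    query = parse_qs(urlparse(link).query)
    return int(query.get("t", ["0"])[0])

# Placeholder URL for illustration only.
link = time_marked_link("https://video.example.com/watch?v=abc123", 95)
```

A second user who opens `link` would have playback start at the value returned by `start_time_from_link(link)`, here 95 seconds.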

7. Granted patent

US08521523B1 Selecting speech data for speech recognition vocabulary (In force)
Publication number: US08521523B1
Publication date: 2013-08-27
Application number: US13593909
Filing date: 2012-08-24
Applicant: Maryam Garrett, Ciprian I. Chelba
Inventor: Maryam Garrett, Ciprian I. Chelba
IPC classification: G10L15/00, G10L15/04, G06F17/30
CPC classification: G06F17/30976, G06F17/2735, G10L15/01, G10L15/063
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting training data. In one aspect, a method comprises: selecting a target out of vocabulary rate; selecting a target percentage of user sessions; and determining a minimum training data freshness for a vocabulary of words, the minimum training data freshness corresponding to the target percentage of user sessions experiencing the target out of vocabulary rate.

8. Granted patent

US07860314B2 Adaptation of exponential models (In force)
Publication number: US07860314B2
Publication date: 2010-12-28
Application number: US10977871
Filing date: 2004-10-29
Applicant: Ciprian I. Chelba, Alejandro Acero
Inventor: Ciprian I. Chelba, Alejandro Acero
IPC classification: G06K9/00
CPC classification: G06F17/273, G06K9/6297
Abstract: A method and apparatus are provided for adapting an exponential probability model. In a first stage, a general-purpose background model is built from background data by determining a set of model parameters for the probability model based on a set of background data. The background model parameters are then used to define a prior model for the parameters of an adapted probability model that is adapted and more specific to an adaptation data set of interest. The adaptation data set is generally of much smaller size than the background data set. A second set of model parameters are then determined for the adapted probability model based on the set of adaptation data and the prior model.
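The two stages in this abstract can be written out as a pair of optimization problems. The formulation below is a sketch under stated assumptions: a conditional exponential model with features f_i and weights λ_i, a background set B, an adaptation set A, and a Gaussian prior centred on the background parameters with per-feature variances σ_i². The abstract only says that the background parameters define a prior; the Gaussian form and all symbols are assumptions.

```latex
% Conditional exponential (maximum entropy) model with features f_i
p_\Lambda(y \mid x) =
  \frac{\exp\left(\sum_i \lambda_i f_i(x, y)\right)}
       {\sum_{y'} \exp\left(\sum_i \lambda_i f_i(x, y')\right)}

% Stage 1: background parameters estimated on the background data B
\Lambda^{0} = \arg\max_{\Lambda} \sum_{(x, y) \in B} \log p_\Lambda(y \mid x)

% Stage 2: adapted parameters estimated on the much smaller adaptation
% data A, with an assumed Gaussian prior centred on \Lambda^{0}
\Lambda^{*} = \arg\max_{\Lambda} \left[
  \sum_{(x, y) \in A} \log p_\Lambda(y \mid x)
  - \sum_i \frac{\left(\lambda_i - \lambda_i^{0}\right)^2}{2 \sigma_i^2}
\right]
```

Under this reading, a small σ_i keeps the adapted weight close to its background value, while a large σ_i lets the adaptation data dominate.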

9. Granted patent

US08515745B1 Selecting speech data for speech recognition vocabulary (In force)
Publication number: US08515745B1
Publication date: 2013-08-20
Application number: US13593703
Filing date: 2012-08-24
Applicant: Maryam Garrett, Ciprian I. Chelba
Inventor: Maryam Garrett, Ciprian I. Chelba
IPC classification: G10L15/00, G10L15/04, G06F17/30
CPC classification: G06F17/30976, G06F17/2735, G10L15/01, G10L15/063
Abstract: Methods, systems, and apparatus for selecting training data. In an aspect, a method comprises: obtaining search session data comprising search sessions that include search queries, wherein each search query comprises words; determining a threshold out of vocabulary rate indicating a rate at which a word in a search query is not included in a vocabulary; determining a threshold session out of vocabulary rate, the session out of vocabulary rate indicating a rate at which search sessions have an out of vocabulary rate that meets the threshold out of vocabulary rate; selecting a vocabulary of words that, for a set of test data, has a session out of vocabulary rate that meets the threshold session out of vocabulary rate, the vocabulary of words being selected from the one or more words included in each of the search queries included in the search sessions.
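The two thresholds in this abstract can be made concrete with a small sketch. It reads "meets the threshold" as "at or below the threshold", represents a session as a list of query strings, and grows the vocabulary by adding the most frequent training words first; all three choices, and every function name, are assumptions rather than details from the patent.

```python
from collections import Counter

def session_oov(session, vocabulary):
    """Word-level out-of-vocabulary rate over all queries in one session."""
    words = [w for query in session for w in query.split()]
    if not words:
        return 0.0
    return sum(w not in vocabulary for w in words) / len(words)

def fraction_meeting(sessions, vocabulary, oov_threshold):
    """Fraction of (non-empty list of) sessions whose OOV rate is at or
    below oov_threshold."""
    ok = sum(session_oov(s, vocabulary) <= oov_threshold for s in sessions)
    return ok / len(sessions)

def select_vocabulary(train_sessions, test_sessions, oov_threshold, session_threshold):
    """Add training words by descending frequency until enough test
    sessions meet the OOV threshold (greedy strategy, an assumption)."""
    counts = Counter(w for s in train_sessions for q in s for w in q.split())
    vocabulary = set()
    for word, _ in counts.most_common():
        vocabulary.add(word)
        if fraction_meeting(test_sessions, vocabulary, oov_threshold) >= session_threshold:
            break
    return vocabulary

# Toy data for illustration only.
train = [["cheap flights", "cheap hotels"], ["weather boston"], ["cheap flights boston"]]
test = [["cheap flights"], ["boston weather"]]
vocab = select_vocabulary(train, test, oov_threshold=0.0, session_threshold=0.5)
```

With these toy sessions the loop stops once half of the test sessions contain no out-of-vocabulary words, which happens after the two most frequent training words have been added.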

10. Granted patent

US07809568B2 Indexing and searching speech with text meta-data (In force)
Publication number: US07809568B2
Publication date: 2010-10-05
Application number: US11269872
Filing date: 2005-11-08
Applicant: Alejandro Acero, Ciprian I. Chelba, Jorge F. Silva Sanchez
Inventor: Alejandro Acero, Ciprian I. Chelba, Jorge F. Silva Sanchez
IPC classification: G10L15/00, G06F7/00
CPC classification: G06F17/30778, G06F17/30746, G06F17/30749, G10L15/197
Abstract: An index for searching spoken documents having speech data and text meta-data is created by obtaining probabilities of occurrence of words and positional information of the words of the speech data and combining it with at least positional information of the words in the text meta-data. A single index can be created because the speech data and the text meta-data are treated the same and considered only different categories.
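The idea of a single index over both speech and text meta-data can be sketched as one inverted index whose postings carry a category label. In the sketch below, each posting stores a document id, a category, a word position, and a probability; text meta-data words are given probability 1.0. The data layout, the category names, and the sample values are assumptions for illustration.

```python
from collections import defaultdict

def build_index(documents):
    """Build one inverted index over every category of every document.

    documents: {doc_id: {category: [(word, position, probability), ...]}}
    Returns: {word: [(doc_id, category, position, probability), ...]}
    """
    index = defaultdict(list)
    for doc_id, categories in documents.items():
        for category, hits in categories.items():
            for word, position, probability in hits:
                index[word].append((doc_id, category, position, probability))
    return index

# Hypothetical spoken document: a text title plus recognizer hypotheses.
documents = {
    "talk1": {
        # Text meta-data: words are known, so probability is set to 1.0.
        "title": [("speech", 0, 1.0), ("indexing", 1, 1.0)],
        # Speech data: competing hypotheses may share a position.
        "speech": [("speech", 0, 0.9), ("search", 1, 0.6), ("church", 1, 0.3)],
    },
}
index = build_index(documents)
```

Looking up `index["speech"]` returns postings from both the title and the speech category, so a ranking function could weight categories differently without needing separate indexes.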
