Patent Quick Search - fast search of global patents, free commercial patent database - IPRDB

1. Invention application

WO2014158239A1 LANGUAGE MODELING OF COMPLETE LANGUAGE SEQUENCES Under examination - Published
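Publication (announcement) number: WO2014158239A1
Publication (announcement) date: 2014-10-02
Application number: PCT/US2013/070732
Filing date: 2013-11-19
Applicant: GOOGLE INC.
Inventors: CHELBA, Ciprian I.; SAK, Hasim; SCHALKWYK, Johan
IPC classification: G10L15/06, G10L15/197
CPC classification: G10L15/063, G10L15/197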
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language modeling of complete language sequences. Training data indicating language sequences is accessed, and counts for a number of times each language sequence occurs in the training data are determined. A proper subset of the language sequences is selected, and a first component of a language model is trained. The first component includes first probability data for assigning scores to the selected language sequences. A second component of the language model is trained based on the training data, where the second component includes second probability data for assigning scores to language sequences that are not included in the selected language sequences. Adjustment data that normalizes the second probability data with respect to the first probability data is generated, and the first component, the second component, and the adjustment data are stored.
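
The training steps in the abstract above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method claimed in the application: the count threshold `min_count`, the use of relative frequency for the first component, the add-one-smoothed bigram model for the second component, and the single scaling factor used as adjustment data are all choices made here for illustration. The function names `train` and `score` are hypothetical.

```python
from collections import Counter


def train(sequences, min_count=2):
    """Train a two-component model on complete sequences (illustrative only)."""
    counts = Counter(sequences)
    total = sum(counts.values())

    # First component (assumption): relative frequency of each complete
    # sequence that occurs at least `min_count` times in the training data.
    selected = {s: c / total for s, c in counts.items() if c >= min_count}

    # Second component (assumption): bigram model with add-one smoothing,
    # trained on all of the training data.
    bigrams, contexts, vocab = Counter(), Counter(), {"</s>"}
    for s in sequences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        vocab.update(tokens[1:])
        for prev, cur in zip(tokens, tokens[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1

    def ngram_prob(s):
        tokens = ["<s>"] + s.split() + ["</s>"]
        p = 1.0
        for prev, cur in zip(tokens, tokens[1:]):
            p *= (bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab))
        return p

    # Adjustment data (assumption): one factor that rescales the second
    # component so that the mass it assigns to non-selected sequences equals
    # the mass left over by the first component.
    mass_first = sum(selected.values())
    mass_second = sum(ngram_prob(s) for s in selected)
    adjustment = (1.0 - mass_first) / (1.0 - mass_second)
    return selected, ngram_prob, adjustment


def score(model, s):
    """Score a sequence with the first component if selected, else the second."""
    selected, ngram_prob, adjustment = model
    if s in selected:
        return selected[s]
    return adjustment * ngram_prob(s)


training_data = ["weather today"] * 3 + ["weather tomorrow"] * 2 + ["news today"]
model = train(training_data, min_count=2)
print(score(model, "weather today"))  # scored by the first component
print(score(model, "news today"))     # scored by the adjusted second component
```

In this toy data set, "weather today" and "weather tomorrow" form the selected proper subset, while "news today" falls through to the rescaled bigram model. Out-of-vocabulary handling is ignored for brevity.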

2. Invention application

WO2014085049A1 SPEECH TRANSCRIPTION INCLUDING WRITTEN TEXT Under examination - Published
Publication (announcement) number: WO2014085049A1
Publication (announcement) date: 2014-06-05
Application number: PCT/US2013/068908
Filing date: 2013-11-07
Applicant: GOOGLE INC.
Inventors: SAK, Hasim; BEAUFAYS, Francoise
IPC classification: G06F17/27, G10L15/26, G10L15/183
CPC classification: G06F17/2775, G10L15/083, G10L15/187, G10L15/197, G10L15/26
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for transcribing utterances into written text are disclosed. The methods, systems, and apparatus include actions of obtaining a lexicon model that maps phones to spoken text and obtaining a language model that assigns probabilities to written text. Further includes generating a transducer that maps the written text to the spoken text, the transducer mapping multiple items of the written text to an item of the spoken text. Additionally, the actions include constructing a decoding network for transcribing utterances into written text, by composing the lexicon model, the inverse of the transducer, and the language model.
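
The composition described in the abstract above (lexicon model, inverse of the written-to-spoken transducer, language model) can be illustrated with a toy Python sketch. It is a simplification under stated assumptions: real systems would use weighted finite-state transducers, whereas here each model is a plain dictionary or list of pairs over whole strings. The helper names `invert`, `compose`, and `transcribe`, the phone string, and the probabilities are all hypothetical.

```python
from collections import defaultdict


def invert(written_to_spoken):
    """Invert the written -> spoken relation into spoken -> {written forms}."""
    inverse = defaultdict(set)
    for written, spoken in written_to_spoken:
        inverse[spoken].add(written)
    return inverse


def compose(lexicon, spoken_to_written, language_model):
    """Chain lexicon, inverted transducer, and language model.

    The result maps a phone string to its written candidates, each paired
    with the probability the language model assigns to it.
    """
    network = {}
    for phones, spoken in lexicon.items():
        candidates = spoken_to_written.get(spoken, set())
        network[phones] = {
            w: language_model[w] for w in candidates if w in language_model
        }
    return network


def transcribe(network, phones):
    """Return the most probable written form, or None if nothing matches."""
    candidates = network.get(phones)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Toy models (all values are made up for illustration).
lexicon = {"t eh n d aa l er z": "ten dollars"}   # phones -> spoken text
written_to_spoken = [                             # several written items map
    ("$10", "ten dollars"),                       # to one spoken item
    ("10 dollars", "ten dollars"),
    ("ten dollars", "ten dollars"),
]
language_model = {"$10": 0.6, "10 dollars": 0.3, "ten dollars": 0.1}

network = compose(lexicon, invert(written_to_spoken), language_model)
print(transcribe(network, "t eh n d aa l er z"))  # prints "$10"
```

Because the language model is defined over written text, the decoder chooses among written forms such as "$10" directly instead of emitting the spoken form and normalizing it afterwards.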
