    • 1. Granted invention patent
    • System and methods for acoustic and language modeling for automatic speech recognition with large vocabularies
    • US07801727B2
    • 2010-09-21
    • US11064643
    • 2005-02-24
    • Ponani Gopalakrishnan; Dimitri Kanevsky; Michael Daniel Monkowski; Jan Sedivy
    • Ponani Gopalakrishnan; Dimitri Kanevsky; Michael Daniel Monkowski; Jan Sedivy
    • G10L15/04
    • G10L15/197; G06F17/27; G10L15/183; Y10S707/99942
    • A method for generating a language component vocabulary VC for a speech recognition system having a language vocabulary V of a plurality of word forms is disclosed. The method includes: partitioning the language vocabulary V into subsets of word forms based on frequencies of occurrence of the respective word forms; and in at least one of the subsets, splitting word forms having frequencies less than a threshold to thereby generate word form components. Also disclosed is a method for use in speech recognition including: splitting an acoustic vocabulary comprising baseforms into baseform components and storing the baseform components; and, performing sound to spelling mapping on the baseform components so as to generate a baseform components to word parts table for use in subsequent decoding of speech. A method for decoding a speech utterance using language model components and acoustic components, includes the steps of: generating from the utterance a stack of baseform component paths; concatenating baseform components in a path to generate concatenated baseforms, when the concatenated baseform components correspond to a baseform found in an acoustic vocabulary; mapping the concatenated baseforms into words; computing language model (LM) scores associated with the words using a language model, and performing further decoding of the utterance based thereupon.
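
The two core ideas in the abstract, splitting rare word forms into components and re-joining baseform components that match a known baseform, can be illustrated with a small Python sketch. This is not the patented implementation. The abstract does not say how word forms are split, so the suffix list, the `+` component markers, the two-way frequent/rare partition, and the function names `build_component_vocabulary` and `join_components` are all assumptions made for illustration. Baseforms are represented here as plain strings rather than phone sequences.

```python
def build_component_vocabulary(freqs, threshold, suffixes):
    """Split rare word forms into stem/ending components.

    freqs:     dict mapping word form -> frequency of occurrence
    threshold: word forms with a lower frequency are candidates for splitting
    suffixes:  hypothetical list of endings used to split a word form
    Returns (kept_words, components).
    """
    kept, components = set(), set()
    # Try longer suffixes first so the most specific ending wins.
    ordered = sorted(suffixes, key=len, reverse=True)
    for word, count in freqs.items():
        if count >= threshold:
            kept.add(word)  # frequent subset: keep the word form whole
            continue
        for suffix in ordered:
            if word.endswith(suffix) and len(word) > len(suffix):
                components.add(word[:-len(suffix)] + "+")  # stem component
                components.add("+" + suffix)               # ending component
                break
        else:
            kept.add(word)  # no known ending: leave the word form unsplit
    return kept, components


def join_components(path, acoustic_vocab):
    """Merge adjacent components whose concatenation is a known baseform."""
    joined, i = [], 0
    while i < len(path):
        if i + 1 < len(path) and path[i] + path[i + 1] in acoustic_vocab:
            joined.append(path[i] + path[i + 1])
            i += 2
        else:
            joined.append(path[i])
            i += 1
    return joined


if __name__ == "__main__":
    freqs = {"walk": 50, "walking": 2, "walked": 1, "xyz": 1}
    kept, comps = build_component_vocabulary(freqs, 5, ["ing", "ed"])
    print(sorted(kept), sorted(comps))
    print(join_components(["walk", "ing", "now"], {"walking"}))
```

In a full system along the lines of the abstract, the joined baseforms would then be mapped to words and scored with a language model. That step is omitted here because the abstract gives no detail about the model.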