    • 2. Published invention application
    • Method and apparatus for the automatic determination of phonological rules as for a continuous speech recognition system
    • EP0387602A2
    • 1990-09-19
    • EP90103824.0
    • 1990-02-27
    • International Business Machines Corporation
    • Bahl, Lalit Rai; Brown, Peter Fitzhugh; Desouza, Peter Vincent; Mercer, Robert Leroy
    • G10L5/06
    • G10L15/14
    • A continuous speech recognition system includes an automatic phonological rules generator which determines variations in the pronunciation of phonemes based on the context in which they occur. This phonological rules generator associates sequences of labels derived from vocalizations of a training text with respective phonemes inferred from the training text. These sequences are then annotated with their phoneme context from the training text and clustered into groups representing similar pronunciations of each phoneme. A decision tree is generated using the context information of the sequences to predict the clusters to which the sequences belong. The training data is processed by the decision tree to divide the sequences into leaf-groups representing similar pronunciations of each phoneme. The sequences in each leaf-group are clustered into sub-groups, each representing a different pronunciation of the corresponding phoneme in a given context. A Markov model is generated for each sub-group. The various Markov models of a leaf-group are combined into a single compound model by assigning common initial and final states to each model. The compound Markov models are used by a speech recognition system to analyze an unknown sequence of labels given its context.
    • 3. Published invention application
    • Speech recognition method and system with efficient storage and rapid assembly of phonological graphs
    • EP0238692A1
    • 1987-09-30
    • EP86104219.0
    • 1986-03-27
    • International Business Machines Corporation
    • Bahl, Lalit Rai; Cohen, Paul Sheldon; Mercer, Robert Leroy
    • G10L5/06
    • G10L15/187; G10L15/08; G10L15/14
    • A continuous speech recognition system is disclosed having a speech processor and a word recognition computer subsystem. The system has means associated with the speech processor for developing a graph of confluent links between confluent nodes; means associated with the speech processor for developing a graph of boundary links between adjacent words; means associated with the speech processor for storing an inventory of confluent links and boundary links as a coding inventory; means associated with the speech processor for converting an unknown utterance into an encoded sequence of confluent links and boundary links corresponding to recognition sequences stored in said word recognition vocabulary for speech recognition. The disclosure also includes a method for achieving continuous speech recognition by characterizing speech as a sequence of confluent links which are matched with candidate words. The invention applies to isolated word speech recognition as well as to continuous speech recognition, except that in such case there are no boundary links.
    • 4. Published invention application
    • Method for performing acoustic matching in a speech recognition system
    • EP0238689A1
    • 1987-09-30
    • EP86104216.6
    • 1986-03-27
    • International Business Machines Corporation
    • Bahl, Lalit Rai; Mercer, Robert Leroy; Degennaro, Steven Vincent
    • G10L5/06
    • G10L15/14
    • Method and apparatus for statistically determining in a vocabulary of words at least one word which has a relatively high probability of having generated a string of incoming labels produced in response to a speech input. Each word is represented as a sequence of phonetic elements (including conventional phones) wherein the statistical representations of the phonetic elements are simplified by approximations. In a first approximation, the actual probabilities that a given phonetic element generates a given label (at any of various transitions t_i between states S_j in the given phonetic element) are replaced by a single specific value which is no less than the maximum actual probability for the given label in the given phonetic element. A second approximation provides that the probability of a given phonetic element to generate any length of labels between a minimum and a maximum is uniform. A third approximation provides that only a maximum number of labels be examined in forming a match value for any phonetic element.
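The first approximation in the abstract above can be illustrated with a small sketch. It is not taken from the patent: the data layout (a dictionary mapping each transition to its label output probabilities) and the function name `upper_bound_label_probs` are assumptions made for illustration, and the sketch picks the maximum itself as the replacement value, which is the smallest value satisfying "no less than the maximum".

```python
from typing import Dict


def upper_bound_label_probs(
    transition_probs: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Collapse per-transition label probabilities into one value per label.

    transition_probs maps a transition name to {label: probability}.
    The result maps each label to the largest probability it has on any
    transition of the phonetic element (hypothetical layout, for illustration).
    """
    bound: Dict[str, float] = {}
    for label_probs in transition_probs.values():
        for label, p in label_probs.items():
            bound[label] = max(bound.get(label, 0.0), p)
    return bound


# Toy phonetic element with two transitions and two labels (made-up numbers).
phone = {
    "t1": {"L1": 0.6, "L2": 0.1},
    "t2": {"L1": 0.2, "L2": 0.5},
}
approx = upper_bound_label_probs(phone)
print(approx)
```

Because every actual probability is at most the replacement value, a match score computed with the simplified values can only overestimate the exact one, which is presumably why the abstract requires the value to be no less than the maximum.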
    • 9. Published invention application
    • Synthesizing word baseforms used in speech recognition
    • EP0241768A2
    • 1987-10-21
    • EP87104309.7
    • 1987-03-24
    • International Business Machines Corporation
    • Bahl, Lalit Rai; Desouza, Peter Vincent; Mercer, Robert Leroy; Picheny, Michael Alan
    • G10L5/06
    • G10L15/14
    • Apparatus and method for synthesizing word baseforms for words not spoken during a training session, wherein each synthesized baseform represents a series of models from a first set of models, which include: (a) uttering speech during a training session and representing the uttered speech as a sequence of models from a second set of models; (b) for each of at least some of the second set models spoken in a given phonetic model context during the training session, storing a respective string of first set models; and (c) constructing a word baseform of first set models for a word not spoken during the training session, including the step of representing each piece of a word that corresponds to a second set model in a given context by the stored respective string, if any, corresponding thereto.
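Step (c) of the abstract above, building a baseform for an unseen word from stored strings, can be sketched as a table lookup followed by concatenation. The sketch rests on assumptions not stated in the abstract: the context of a second-set model is taken to be its immediate left and right neighbours, the names `synthesize_baseform` and `stored_strings` are hypothetical, and pieces with no stored string are simply reported rather than replaced by any fallback.

```python
from typing import Dict, List, Optional, Tuple

Context = Tuple[Optional[str], str, Optional[str]]


def synthesize_baseform(
    phones: List[str],
    stored_strings: Dict[Context, List[str]],
) -> Tuple[List[str], List[Context]]:
    """Concatenate stored first-set model strings for each phone in context.

    phones is the second-set (phonetic) spelling of a word not spoken in
    training. stored_strings maps (left, phone, right) to the string of
    first-set models recorded for that phone in that context.
    Returns the synthesized baseform and the contexts that had no entry.
    """
    baseform: List[str] = []
    missing: List[Context] = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] if i > 0 else None
        right = phones[i + 1] if i + 1 < len(phones) else None
        key = (left, phone, right)
        string = stored_strings.get(key)
        if string is None:
            missing.append(key)  # "if any": nothing stored for this context
        else:
            baseform.extend(string)
    return baseform, missing


# Made-up table of stored strings, as if collected during training.
table = {
    (None, "K", "AE"): ["f12", "f12", "f40"],
    ("K", "AE", "T"): ["f7", "f8"],
}
baseform, missing = synthesize_baseform(["K", "AE", "T"], table)
print(baseform, missing)
```

Returning the missing contexts separately keeps the sketch honest about the gap: the abstract only says a stored string is used "if any" and does not say what happens otherwise.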
    • 10. Published invention application
    • Method and apparatus for generating word model baseforms for speech recognition
    • EP0238698A1
    • 1987-09-30
    • EP86104226.5
    • 1986-03-27
    • International Business Machines Corporation
    • Bahl, Lalit Rai; Desouza, Peter Vincent; Mercer, Robert Leroy; Picheny, Michael Alan
    • G10L5/06
    • G10L15/04
    • The present invention relates to method and apparatus for constructing word baseforms which can be matched against a string of generated acoustic labels which includes: forming a set of phonetic phone machines, wherein each phone machine has (i) a plurality of states, (ii) a plurality of transitions each of which extends from a state to a state, (iii) a stored probability for each transition, and (iv) stored label output probabilities, each label output probability corresponding to the probability that the respective phone machine produces a corresponding label; wherein said set of phonetic machines is formed to include a subset of onset phone machines, the stored probabilities of each onset phone machine corresponding to at least one phonetic element being uttered at the beginning of a speech segment; and wherein said set of phonetic machines is formed to include a subset of trailing phone machines, the stored probabilities of each trailing phone machine corresponding to at least one single phonetic element being uttered at the end of a speech segment. Word baseforms are constructed by concatenating phone machines selected from the set.