    • 1. Granted invention patent
    • Single-count backing-off method of determining N-gram language model values
    • US5745876A
    • 1998-04-28
    • US642012
    • 1996-05-02
    • Reinhard Kneser; Hermann Ney
    • Reinhard Kneser; Hermann Ney
    • G10L15/197; G10L5/06
    • G10L15/197
    • For the recognition of coherently spoken speech with a large vocabulary, language model values which take into account the probability of word sequences are considered at word transitions. Prior to the recognition, these language model values are derived on the basis of training speech signals. If the amount of training data is kept within sensible limits, not all word sequences will actually occur, so that the language model values for, for example, an N-gram language model must be determined from word sequences of N-1 words that actually occur. In accordance with the invention, the reduced word sequences derived from each distinct complete word sequence are counted only once, irrespective of how often the complete word sequence actually occurs; alternatively, only reduced training sequences which occur exactly once in the training data are taken into account.
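The single-count idea in the abstract above can be illustrated with a minimal sketch. It is not the patented procedure itself: the function name `single_count_backoff_counts` is hypothetical, and the choice to form the reduced sequence by dropping the oldest word of each N-gram is an assumption, since the abstract does not say which word is removed.

```python
from collections import defaultdict


def single_count_backoff_counts(ngrams):
    """Count each reduced (N-1)-word sequence once per distinct N-gram.

    ngrams: iterable of N-word tuples as observed in training data,
            repetitions included.
    Assumption: the reduced sequence drops the oldest (first) word.
    """
    counts = defaultdict(int)
    # set() discards repetitions, so the frequency of a complete
    # N-gram has no influence on the reduced-sequence counts.
    for ngram in set(ngrams):
        counts[ngram[1:]] += 1
    return dict(counts)


tokens = "a b a b c b".split()
bigrams = list(zip(tokens, tokens[1:]))
counts = single_count_backoff_counts(bigrams)
```

In this toy data the word "b" ends three bigram tokens but only two distinct bigrams, ("a", "b") and ("c", "b"), so its single count is 2 rather than 3.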
    • 2. Granted invention patent
    • Method for determining the variation with time of a speech parameter and arrangement for carrying out the method
    • US4813075A
    • 1989-03-14
    • US125101
    • 1987-11-24
    • Hermann Ney
    • Hermann Ney
    • G10L25/90; G10L5/00
    • G10L25/90
    • In a speech or speaker recognition system, a segment or sequence of speech parameter values is smoothed to a most probable sequence by dynamic programming. The method for determining the variation with time of a speech parameter is based on a speech signal which is subdivided into successive segments, where an individual value exists in each segment for each value of the parameter within a limited range of values. For the example of the fundamental voice frequency, a value has been generated in each speech segment with the aid of the AMDF (Average Magnitude Difference Function). The required variation now links a sequence of horizontally, vertically or diagonally directly adjacent speech parameter values to one another in such a manner that the sum of the associated individual values represents a minimum. In this arrangement, this sum is slightly increased in diagonal or vertical sections, since a horizontal variation is most probable. This increase is controlled by certain fixed values which influence the smoothness of the variation.
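The dynamic-programming smoothing described in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented arrangement: the function name `smooth_track` and the parameter `diag_penalty` are hypothetical, only horizontal and diagonal steps between consecutive segments are modelled (vertical steps are omitted for brevity), and `cost[t][k]` stands in for the individual value (for example an AMDF value) of parameter candidate `k` in segment `t`.

```python
def smooth_track(cost, diag_penalty=0.5):
    """Return the candidate index per segment with minimal total cost.

    cost[t][k]: local value for candidate k in segment t (lower is better).
    diag_penalty: assumed fixed surcharge for changing the candidate by
                  one step between consecutive segments.
    """
    n_val = len(cost[0])
    total = [list(cost[0])]  # accumulated cost per candidate
    back = []                # back-pointers, one list per segment after the first
    for t in range(1, len(cost)):
        row, ptr = [], []
        for k in range(n_val):
            # Horizontal step: stay on the same candidate, no surcharge.
            best_prev, best = k, total[-1][k]
            # Diagonal steps: come from a neighbouring candidate.
            for j in (k - 1, k + 1):
                if 0 <= j < n_val and total[-1][j] + diag_penalty < best:
                    best_prev, best = j, total[-1][j] + diag_penalty
            row.append(best + cost[t][k])
            ptr.append(best_prev)
        total.append(row)
        back.append(ptr)
    # Trace the cheapest path backwards from the last segment.
    k = min(range(n_val), key=lambda i: total[-1][i])
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return path


cost = [
    [5.0, 1.0, 5.0],
    [5.0, 1.0, 5.0],
    [1.0, 1.2, 5.0],  # candidate 0 is locally cheapest here
    [5.0, 1.0, 5.0],
]
smoothed = smooth_track(cost, diag_penalty=0.5)
unsmoothed = smooth_track(cost, diag_penalty=0.0)
```

With the surcharge in place the path stays on candidate 1 throughout; with a surcharge of zero it follows the local outlier in the third segment. A larger `diag_penalty` gives a smoother track.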
    • 4. Granted invention patent
    • Method and apparatus for recognizing spoken words in a speech signal
    • US5613034A
    • 1997-03-18
    • US312495
    • 1994-09-26
    • Hermann Ney; Volker Steinbiss
    • Hermann Ney; Volker Steinbiss
    • G10L15/08; G10L15/197; G10L9/00
    • G10L15/08; G10L15/197
    • In the recognition of coherent speech, language models are advantageously used to increase the reliability of recognition; such models take into account, for example, the probabilities of word combinations, especially of word pairs. For this purpose, a language model value corresponding to this probability is added at boundaries between words. In several recognition methods, for example when the vocabulary is built up from phonemes in the form of a tree, it is not known at the start of the continuation of a hypothesis after a word end which word will actually follow, so that a language model value cannot be taken into account until the end of the next word. Measures are given for achieving this in such a manner that, as far as possible, the optimal preceding word or the optimal preceding word sequence is taken into account for the language model value without the necessity of constructing a copy of the search tree for each and every simultaneously ending preceding word sequence.
    • 5. Granted invention patent
    • Method of recognizing coherently spoken words
    • US5058166A
    • 1991-10-15
    • US523305
    • 1990-05-11
    • Hermann Ney; Andreas Noll
    • Hermann Ney; Andreas Noll
    • G10L15/00; G10L15/14
    • G10L15/00
    • During the recognition, speech values which are derived from sample values of the speech signals are compared with reference values, each word of a given vocabulary being given by a sequence of reference values. The words are determined from phonemes according to a fixed pronouncing lexicon, and the reference values for the phonemes are determined in a learning phase, each phoneme within a word consisting of a number of equal reference values determined in the learning phase. In order to approximate transitions between phonemes, each phoneme may also consist of three sections, each with constant reference values. By the given number of reference values per phoneme, the time duration of a phoneme in a given word can be simulated more accurately. Different possibilities are indicated to determine the reference values and the distance value during the recognition.
    • 7. Granted invention patent
    • Method and apparatus for recognizing spoken words in a speech signal by organizing the vocabulary in the form of a tree
    • US5995930A
    • 1999-11-30
    • US751377
    • 1996-11-19
    • Reinhold Hab-Umbach; Hermann Ney
    • Reinhold Hab-Umbach; Hermann Ney
    • G10L15/08; G10L15/12; G10L15/187; G10L5/06
    • G10L15/187; G10L15/08
    • A method and apparatus for processing a sequence of words in a speech signal for speech recognition. The method includes the steps of sampling, at recurrent instants, said speech signal for generating a series of test signals. Signal-by-signal matching and scoring is generated between the test signals and a series of reference signals, where each of the series of reference signals forms one of a plurality of vocabulary words arranged as a vocabulary tree. The vocabulary tree includes a root and a plurality of tree branches wherein any tree branch has a predetermined number of reference signals and is assigned to a speech element and any vocabulary word is assigned to a particular branch junction or branch end. Acoustic recombination determines both continuations of branches and the most probable partial hypotheses within a word because of the use of a vocabulary built up as a tree with branches having reference signals. At least one complete word for a particular test signal is determined, and, separately, for each completed word, there is: I) a word result formed including a word score and an aggregate score, said aggregate score derived from said word score and from a language model value assigned to a combination of said completed word and a uniform-length string of prior completed words.
    • 8. Granted invention patent
    • Method of recognizing a sequence of words and device for carrying out the method
    • US5946655A
    • 1999-08-31
    • US413051
    • 1995-03-29
    • Volker Steinbiss; Bach-Hiep Tran; Hermann Ney
    • Volker Steinbiss; Bach-Hiep Tran; Hermann Ney
    • G10L15/10; G10L15/05; G10L15/08; G10L15/197; G10L5/02
    • G10L15/197; G10L15/05; G10L15/08
    • When a language model is to be used for the recognition of a speech signal and the vocabulary is composed as a tree, the language model value cannot be taken into account before the word end. Customarily, after each word end the comparison with a tree root is started anew, albeit with a score which has been increased by the language model value, so that the threshold value for the scores at which hypotheses are terminated must be high and hence many, even unattractive, hypotheses remain active for a prolonged period of time. In order to avoid this, in accordance with the invention a correction value is added to the score for at least a part of the nodes of the vocabulary tree; the sum of the correction values on the path to a word must then not be greater than the language model value for the relevant word. As a result, for each test signal the scores of all hypotheses are of a comparable order of magnitude. When a word end is reached, the sum of all added correction values is subtracted and the correct language model value is added.
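The correction-value scheme in the abstract above can be sketched on a small phoneme tree. The abstract does not say how the correction values are chosen; the sketch assumes one possible choice, in which each node stores the smallest language model cost of any word reachable below it and the correction at a node is the increase of that minimum relative to its parent. All names (`build_tree`, `compute_lookahead`, `set_corrections`, `path_correction_sum`) and the toy lexicon are hypothetical, and language model values are taken to be non-negative costs.

```python
def build_tree(lexicon):
    """Arrange words as a prefix tree over their phoneme sequences."""
    root = {"children": {}, "words": []}
    for word, phones in lexicon.items():
        node = root
        for p in phones:
            node = node["children"].setdefault(p, {"children": {}, "words": []})
        node["words"].append(word)
    return root


def compute_lookahead(node, lm_cost):
    """Store the smallest LM cost of any word at or below each node."""
    best = min((lm_cost[w] for w in node["words"]), default=float("inf"))
    for child in node["children"].values():
        best = min(best, compute_lookahead(child, lm_cost))
    node["lookahead"] = best
    return best


def set_corrections(node, parent_lookahead=0.0):
    """Correction at a node = increase of the minimum relative to the parent."""
    node["correction"] = node["lookahead"] - parent_lookahead
    for child in node["children"].values():
        set_corrections(child, node["lookahead"])


def path_correction_sum(root, phones):
    """Sum of corrections added on the path from the root to a word end."""
    node, total = root, root["correction"]
    for p in phones:
        node = node["children"][p]
        total += node["correction"]
    return total


lexicon = {"cat": ["k", "ae", "t"], "can": ["k", "ae", "n"],
           "do": ["d", "ao"], "dog": ["d", "ao", "g"]}
lm_cost = {"cat": 2.0, "can": 5.0, "do": 6.0, "dog": 3.0}  # assumed costs

root = build_tree(lexicon)
compute_lookahead(root, lm_cost)
set_corrections(root)

for word, phones in lexicon.items():
    added = path_correction_sum(root, phones)
    # At the word end the added corrections are removed and the true
    # language model value is applied instead.
    final_lm_part = added - added + lm_cost[word]
    assert added <= lm_cost[word]
```

Because the stored minimum can only grow from parent to child, every correction is non-negative, and the sum along a path equals the minimum stored at the last node, which never exceeds the cost of a word ending there. For "do" the sum is 3.0 against a cost of 6.0, so the bound is not always tight.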
    • 9. Granted invention patent
    • Method of recognizing continuously spoken words
    • US5005203A
    • 1991-04-02
    • US175085
    • 1988-03-30
    • Hermann Ney
    • Hermann Ney
    • G10L15/10; G10L15/12; G10L15/193
    • G10L15/193; G10L15/12
    • A method of recognizing continuously spoken words in which, during the speech recognition, speech values derived from the speech signal are compared with comparison values of the individual words of a given vocabulary. In order to reduce the rate of recognition errors, it is known in principle to take into account speech models, which admit only selected sequences of words. From the theory of formal languages, a class of speech models designated as "context-free grammar" is known, which represents a comparatively flexible speech model. For the use of this speech model in the technical process of recognition of a speech signal, two lists are now utilized, which indicate the assignment between words and given syntactical classes and the assignment of these classes to, as the case may be, two other classes. Both lists are used at each new speech signal, in that it is constantly determined, looking backwards, which class best explains the preceding speech section. At the end of the speech signal, starting from the class indicating the whole sentence, the sequence of words can be traced backwards which has yielded the smallest total distance sum and which moreover fits into the speech model given by the two lists.
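The two lists mentioned in the abstract above (words to syntactical classes, and classes to pairs of other classes) correspond to a context-free grammar in a binary form, which can be searched with a chart-based dynamic program. The sketch below is a strong simplification and not the patented method: it works on already segmented word positions rather than on the speech signal itself, and the function name `best_parse`, the grammar and the distances are all hypothetical.

```python
def best_parse(hyps, word_classes, rules, start="S"):
    """Find the grammatical word sequence with the smallest distance sum.

    hyps: one dict per word position mapping candidate word -> distance.
    word_classes: first list, word -> syntactical classes.
    rules: second list, (parent_class, (left_class, right_class)).
    Returns (total_distance, words) or None if no sentence fits.
    """
    n = len(hyps)
    if n == 0:
        return None
    chart = {}  # chart[(i, j)][cls] = (cost, words) for positions i..j-1
    for i, cands in enumerate(hyps):
        cell = {}
        for word, dist in cands.items():
            for cls in word_classes.get(word, ()):
                if cls not in cell or dist < cell[cls][0]:
                    cell[cls] = (dist, [word])
        chart[(i, i + 1)] = cell
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = {}
            for k in range(i + 1, j):
                left, right = chart[(i, k)], chart[(k, j)]
                for parent, (a, b) in rules:
                    if a in left and b in right:
                        cost = left[a][0] + right[b][0]
                        if parent not in cell or cost < cell[parent][0]:
                            cell[parent] = (cost, left[a][1] + right[b][1])
            chart[(i, j)] = cell
    return chart[(0, n)].get(start)


word_classes = {"the": ["Det"], "dog": ["N"], "barks": ["V"]}
rules = [("S", ("NP", "V")), ("NP", ("Det", "N"))]
hyps = [
    {"the": 1.0, "dog": 0.5},
    {"dog": 1.0, "barks": 0.8},
    {"barks": 1.0, "dog": 0.2},
]
result = best_parse(hyps, word_classes, rules)
```

In this example the acoustically cheapest choice per position ("dog", "barks", "dog") does not fit the grammar, so the search returns "the dog barks" with a total distance of 3.0.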