    • 1. Granted invention patent
    • System for voice verification using matched frames
    • US06308153B1
    • 2001-10-23
    • US09307373
    • 1999-05-07
    • William Y. Huang; Lawrence G. Bahler; Alan L. Higgins
    • G10L17/00
    • G10L17/24
    • A system and method are disclosed for verifying the voice of a user conducting a telephone transaction. The system and method include a mechanism for prompting the user to speak in a limited vocabulary. A feature extractor converts the limited vocabulary into a plurality of speech frames. A pre-processor is coupled to the feature extractor for processing the plurality of speech frames to produce a plurality of processed frames. The processing includes frame selection, which eliminates each of the plurality of speech frames having an absence of words. A Viterbi decoder is also coupled to the feature extractor for assigning a frame label to each of the plurality of speech frames to produce a plurality of frame labels. The processed frames and frame labels are then combined to produce a voice model, which includes each of the plurality of frame labels that correspond to the plurality of processed frames. A mechanism is also provided for comparing the voice model with the claimant's voice model, derived during a previous enrollment session. The voice model is also compared with an alternate voice model set, derived during previous enrollment sessions. The claimed identity is accepted if the voice model matches the claimant's voice model better than the alternate voice model set.
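The accept/reject rule in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the silence label "sil", the nearest-frame Euclidean distance, and the function names select_frames, match_distance and verify are all assumptions chosen for illustration. Frame labels are taken as given rather than produced by a Viterbi decoder.

```python
from math import dist  # Euclidean distance, Python 3.8+


def select_frames(frames, labels, silence_label="sil"):
    """Frame selection: drop frames whose label marks an absence of words.

    The label "sil" is a hypothetical convention, not from the source.
    """
    return [(f, l) for f, l in zip(frames, labels) if l != silence_label]


def match_distance(test, model):
    """Average distance from each test frame to the nearest model frame
    that carries the same frame label (lower means a better match)."""
    total, count = 0.0, 0
    for frame, label in test:
        same_label = [m for m, ml in model if ml == label]
        if same_label:
            total += min(dist(frame, m) for m in same_label)
            count += 1
    return total / count if count else float("inf")


def verify(test, claimant_model, alternate_models):
    """Accept the claimed identity only if the claimant's model matches
    better than every model in the (non-empty) alternate set."""
    claimant = match_distance(test, claimant_model)
    best_alternate = min(match_distance(test, m) for m in alternate_models)
    return claimant < best_alternate


frames = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
labels = ["sil", "one", "two"]
test = select_frames(frames, labels)
claimant = [((1.1, 1.0), "one"), ((5.0, 5.2), "two")]
other = [((3.0, 3.0), "one"), ((8.0, 8.0), "two")]
print(verify(test, claimant, [other]))
```

Comparing only frames with the same label is one plausible reading of "matched frames"; a real system would use enrollment data and a tuned distance measure.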
    • 2. Granted invention patent
    • Continuous speech recognition method for improving false alarm rates
    • US4241329A
    • 1980-12-23
    • US901005
    • 1978-04-27
    • Lawrence G. Bahler; Stephen L. Moshier
    • G10L15/10; G10L15/00; G10L15/02; G10L1/00
    • G10L15/00
    • A speech recognition method for detecting and recognizing one or more keywords in a continuous audio signal is disclosed. Each keyword is represented by a keyword template representing one or more target patterns, and each target pattern comprises statistics of each of at least one spectrum selected from plural short-term spectra generated according to a predetermined system for processing of the incoming audio. The incoming audio spectra are compared with the target patterns of the keyword templates and candidate keywords are selected according to a predetermined decision process. In post-decision processing, concatenation techniques, based upon a likelihood ratio test, for rejecting false alarms are disclosed. Post-decision processing can also include a prosodic test to enhance the effectiveness of the recognition apparatus.
    • 3. Granted invention patent
    • Automatic confirmation of personal notifications
    • US06876987B2
    • 2005-04-05
    • US09772651
    • 2001-01-30
    • Lawrence G. Bahler; Alan L. Higgins
    • G10L15/26; G10L17/00; H04M3/533; G06F15/18
    • G10L17/24; G10L15/26; H04M3/533; H04M2203/2083
    • An automated system obtains a confirmation of receipt of a notification by the intended recipient by having the recipient speak all or part of the notification. The words spoken by the recipient are determined by a computerized system using an automatic speech recognition algorithm. The computerized system determines whether the words spoken are those of the notification, and if they are, the system accepts the confirmation from the recipient. Optionally, the system additionally applies an automatic speaker recognition algorithm to determine whether the person reciting the notification has similar voice characteristics to the intended recipient based on a previous enrollment of the intended recipient's voice. The system can also record the recipient reciting the notification so that it can later be compared to the intended recipient's voice if the intended recipient repudiates the confirmation.
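The confirmation check described in the abstract above (the recipient speaks all or part of the notification, and the system tests whether the recognized words are those of the notification) might be sketched as follows. This sketch assumes the speech recognizer's output is already available as text. The names words and confirms and the min_words threshold are hypothetical and do not come from the patent.

```python
import re


def words(text):
    """Lower-case a string and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def confirms(notification, recognized, min_words=3):
    """Return True if the recognized words form a contiguous part of the
    notification that is at least min_words long (or the whole
    notification, if it is shorter than min_words)."""
    note, spoken = words(notification), words(recognized)
    if len(spoken) < min(min_words, len(note)):
        return False
    n = len(spoken)
    return any(note[i:i + n] == spoken for i in range(len(note) - n + 1))


notification = "Your flight to Denver departs at 9 AM"
print(confirms(notification, "flight to denver departs"))
print(confirms(notification, "flight to boston"))
```

Exact word matching is a simplification; a practical system would have to tolerate recognition errors, and the optional speaker-recognition step from the abstract is not modeled here.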
    • 4. Granted invention patent
    • Continuous speech recognition
    • US4481593A
    • 1984-11-06
    • US309209
    • 1981-10-05
    • Lawrence G. Bahler
    • G10L11/00; G10L15/00; G10L15/04; G10L15/06; G10L15/10; G10L15/12; G10L15/18; G10L15/28; G10L1/00
    • G10L15/05; G10L15/12; G10L15/193; G10L2015/088
    • An improved speech recognition method and apparatus for recognizing keywords in a continuous audio signal are disclosed. The keywords, generally either a word or a string of words, are each represented by an element template defined by a plurality of target patterns. Each target pattern is represented by a plurality of statistics describing the expected behavior of a group of spectra selected from plural short-term spectra generated by processing of the incoming audio. The incoming audio spectra are processed to enhance the separation between the spectral pattern classes during later analysis. The processed audio spectra are grouped into multi-frame spectral patterns and are compared, using likelihood statistics, with the target patterns of the element templates. Each multi-frame pattern is forced to contribute to each of a plurality of pattern scores as represented by the element templates. The method and apparatus use speaker independent word models during the training stage to generate, automatically, improved target patterns. The apparatus and method further employ grammatical syntax during the training stage for identifying the beginning and ending boundaries of unknown keywords. Recognition is further improved by use of a plurality of templates representing "silence" or non-speech signals, for example, hum. Also, memory and computation load is reduced by use of modified (collapsed or folded) syntax flow graph logic, implemented by additional (augment) control numbers. A concatenation technique is employed, using dynamic programming techniques, to determine the correct identity of the word string.
    • 5. Granted invention patent
    • Speaker independent speech recognition method utilizing multiple training iterations
    • US5806034A
    • 1998-09-08
    • US510321
    • 1995-08-02
    • Joe A. Naylor; William Y. Huang; Lawrence G. Bahler
    • G10L15/14; G10L9/06
    • G10L15/144
    • A method for recognizing spoken utterances of a speaker is disclosed, the method comprising the steps of providing a database of labeled speech data; providing a prototype of a Hidden Markov Model (HMM) definition to define the characteristics of the HMM; and parameterizing speech utterances according to one of linear prediction parameters or Mel-scale filter bank parameters. The method further includes selecting a frame period for accommodating the parameters and generating HMMs and decoding to specified speech utterances by causing the user to utter predefined training speech utterances for each HMM. The method then statistically computes the generated HMMs with the prototype HMM to provide a set of fully trained HMMs for each utterance indicative of the speaker. The trained HMMs are used for recognizing a speaker by computing Laplacian distances via distance table lookup for utterances of the speaker during the selected frame period; and iteratively decoding node transitions corresponding to the spoken utterances during the selected frame period to determine which predefined utterance is present.
    • 6. Granted invention patent
    • Keyword recognition system and method using template concantenation model
    • US5218668A
    • 1993-06-08
    • US961014
    • 1992-10-14
    • Alan L. Higgins; Robert E. Wohlford; Lawrence G. Bahler
    • G10L11/02; G10L15/00
    • G10L25/87; G10L15/00
    • Similarity is measured between incoming speech and a plurality of candidate strings of filler and keyword templates to test the hypothesis of a keyword being present against the hypothesis of a keyword not being present in the input speech. As alternatives to a candidate string containing a keyword template, other candidate strings are assembled with filler templates of short speech sounds that may be similar to a keyword. A keyword is "recognized" only when an optimal candidate string containing the corresponding keyword template is determined to match the input speech more closely than all other candidate strings. A concatenation penalty is added to the partial string score each time a new template is added to a candidate string, in order to bias the score in favor of a candidate string containing a longer keyword template.
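The effect of the concatenation penalty described in the abstract above can be shown with a small Python sketch. It assumes each candidate string already comes with a per-template distance to the input speech (lower is better). The functions string_score, best_string and spotted_keywords, and all the numbers, are hypothetical illustrations rather than the patented procedure.

```python
def string_score(costs, penalty):
    """Total score of a candidate string: the sum of its template
    distances plus one concatenation penalty per template added."""
    return sum(costs) + penalty * len(costs)


def best_string(candidates, penalty):
    """Pick the candidate (names, costs) pair with the lowest score."""
    return min(candidates, key=lambda c: string_score(c[1], penalty))


def spotted_keywords(candidates, penalty, keywords):
    """A keyword is recognized only if the best string contains it."""
    names, _ = best_string(candidates, penalty)
    return [n for n in names if n in keywords]


candidates = [
    (["boston"], [4.0]),                              # one long keyword template
    (["f1", "f2", "f3", "f4"], [0.9, 0.9, 0.9, 0.9]),  # four short fillers
]
print(spotted_keywords(candidates, 0.0, {"boston"}))
print(spotted_keywords(candidates, 0.5, {"boston"}))
```

Without a penalty the four short fillers fit the input slightly better than the keyword; charging 0.5 per template makes the single long keyword template the better string, which is the bias the abstract describes.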
    • 7. Granted invention patent
    • Automatic speech recognition system using seed templates
    • US4994983A
    • 1991-02-19
    • US346054
    • 1989-05-02
    • Blakely P. Landell; Robert E. Wohlford; Lawrence G. Bahler
    • G10L15/06
    • G10L15/063
    • An automatic speech recognition system has a multi-mode training capability using a set of previously stored templates of a limited number of predetermined seed words to train the templates for a vocabulary of words. The training speech samples each include a vocabulary word juxtaposed with a seed word. An averager module maintains an active average template for each of the word units of the training speech samples including the seed word units, and the active average templates are used to continuously update the seed template set as they are used in the training speech samples. The preferred training procedure employs training phrases each having a vocabulary word embedded between two seed words, and two seed template sets are used in succession, the first being composed of single-digit words, and the second composed of carrier words.
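The "active average template" maintained by the averager module in the abstract above can be pictured as a running mean over training samples. The Python sketch below assumes each template is a flat list of feature values already time-aligned with the incoming sample. That alignment step, and the name update_average, are assumptions not spelled out in the abstract.

```python
def update_average(template, sample, count):
    """Fold one new aligned sample into a running-average template.

    template: current average (list of floats)
    sample:   new observation of the same length
    count:    number of samples already averaged into the template
    Returns the updated template and the new count.
    """
    new_count = count + 1
    updated = [t + (s - t) / new_count for t, s in zip(template, sample)]
    return updated, new_count


template, count = [0.0, 0.0], 1
template, count = update_average(template, [2.0, 4.0], count)
template, count = update_average(template, [4.0, 5.0], count)
print(template, count)
```

The incremental form avoids storing every training sample, which suits the continuous updating of seed templates mentioned in the abstract.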
    • 9. Granted invention patent
    • System and method for passive voice verification in a telephone network
    • US5414755A
    • 1995-05-09
    • US105849
    • 1994-08-10
    • Lawrence G. Bahler; Alan L. Higgins
    • G10L17/00; H04M3/38; H04M15/00; H04Q3/70; H04M11/00
    • G10L17/22; G10L17/00; H04M15/00; H04M3/382; H04Q3/70; H04M2201/40
    • A telephone long distance service is provided using speaker verification to determine whether a user is a valid user or an impostor. The user claims an identity by offering some form of identification, typically by entering a calling card number on the phone's touch-tone keypad or by a magnetic strip on the card which is read by the telephone. Unrestricted, extemporaneous speech of a group of customers is digitized, analyzed in accordance with a PCM circuit, and characterized as a non-parametric set of speech feature vectors. The extemporaneous speech of the long distance telephone service user claiming the identity of a service customer via his card number is digitized and analyzed in a like manner. The identity of the user is verified by comparing, either during or after the call, the signals in accordance with an algorithm which compares a reference utterance of a known customer with input utterances from one or more unknown telephone service users, one of whom has claimed the identity of the customer. This comparison results in a decision to accept or reject the hypothesized identity. The identity hypothesis to be tested is thus derived from the calling card of the customer.
    • 10. Granted invention patent
    • Automated sorting of voice messages through speaker spotting
    • US5271088A
    • 1993-12-14
    • US44546
    • 1993-04-07
    • Lawrence G. Bahler
    • G10L17/00; G10L21/02; G10L5/06
    • G10L17/06; G10L21/028
    • A speaker recognition apparatus employs a non-parametric baseline algorithm for speaker recognition which characterizes a given speaker's speech patterns by a set of speech feature vectors, and generates match scores which are sums of a ScoreA set equal to the average of the minimum Euclidean squared distance between the unknown speech frame and all reference frames of a given speaker over all frames of the unknown input, and ScoreB set equal to the average of the minimum Euclidean squared distance between each frame of the reference set to all frames of the unknown input. The performance on a queue of talkers is further improved by normalization of reference message match distances. The improved baseline algorithm addresses the co-channel problem of speaker spotting when plural speech signals are intermixed on the same channel by using a union of reference sets for pairs of speakers as the reference set for a co-channel signal, and/or by conversational state modelling.
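The match score defined in the abstract above (ScoreA plus ScoreB, each an average of minimum squared Euclidean distances) translates directly into a few lines of Python. The sketch below treats each frame as a tuple of feature values. The function names are illustrative, and the normalization and co-channel extensions mentioned in the abstract are left out.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def avg_min_sq_dist(src, dst):
    """For every frame in src, take the squared distance to its nearest
    frame in dst, then average over all frames of src."""
    return sum(min(sq_dist(s, d) for d in dst) for s in src) / len(src)


def match_score(unknown, reference):
    """ScoreA + ScoreB as described in the abstract (lower is closer)."""
    score_a = avg_min_sq_dist(unknown, reference)   # unknown -> reference
    score_b = avg_min_sq_dist(reference, unknown)   # reference -> unknown
    return score_a + score_b


unknown = [(0.0, 0.0), (1.0, 0.0)]
reference = [(0.0, 0.0), (3.0, 0.0)]
print(match_score(unknown, reference))  # 0.5 + 2.0 = 2.5
```

Summing both directions makes the score symmetric, so a reference frame with no close counterpart in the unknown input is penalized as well as the reverse.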