    • 1. Granted invention patent
    • Method and apparatus for performing pattern-specific maximum likelihood transformations for speaker recognition
    • US06751590B1
    • 2004-06-15
    • US09592205
    • 2000-06-13
    • Upendra V. Chaudhari; Ramesh Ambat Gopinath; Stephane Herman Maes
    • Upendra V. Chaudhari; Ramesh Ambat Gopinath; Stephane Herman Maes
    • G10L17/00
    • G10L17/02; G10L17/04
    • The present invention uses acoustic feature transformations, referred to as pattern-specific maximum likelihood transformations (PSMLT), to model the voice print of speakers in either a text dependent or independent mode. Each transformation maximizes the likelihood, when restricting to diagonal models, of the speaker training data with respect to the resulting voice-print model in the new feature space. Speakers are recognized (i.e., identified, verified or classified) by appropriate comparison of the likelihood of the testing data in each transformed feature space and/or by directly comparing transformation matrices obtained during enrollment and testing. It is to be appreciated that the principle of pattern-specific maximum likelihood transformations can be extended to a large number of pattern matching problems and, in particular, to other biometrics besides speech.
    • 3. Granted invention patent
    • Method and apparatus for speaker recognition
    • US06349280B1
    • 2002-02-19
    • US09262083
    • 1999-03-04
    • Hiroaki Hattori
    • Hiroaki Hattori
    • G10L17/00
    • G10L17/14; G10L15/10
    • A method is provided for recognizing the speaker of an input speech according to the distance between an input speech pattern, obtained by converting the input speech to a feature parameter series, and a reference pattern preliminarily registered as a feature parameter series for each speaker. The contents of the input and reference speech patterns are obtained by recognition. An identical section, in which the contents of the input and reference speech patterns are identical, is determined. The distance between the input and reference speech patterns in the identical content section is determined. The speaker of the input speech is recognized on the basis of the determined distance.
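The comparison described in the abstract above can be illustrated with a minimal Python sketch. It is not the patented method: it assumes each speech pattern has already been reduced to one feature vector per recognized content unit, and it uses a mean Euclidean distance. The names `section_distance` and `recognize_speaker` are hypothetical.

```python
from math import dist

def section_distance(input_pattern, reference_pattern):
    """Mean Euclidean distance over the content units present in both patterns.

    Assumption: each pattern is a dict mapping a recognized content unit
    (for example a word label) to a single feature vector.
    Returns None when the two patterns share no content.
    """
    shared = sorted(set(input_pattern) & set(reference_pattern))
    if not shared:
        return None
    total = sum(dist(input_pattern[u], reference_pattern[u]) for u in shared)
    return total / len(shared)

def recognize_speaker(input_pattern, references):
    """Return (speaker, distance) for the closest registered reference, or None."""
    best = None
    for speaker, reference_pattern in references.items():
        d = section_distance(input_pattern, reference_pattern)
        if d is not None and (best is None or d < best[1]):
            best = (speaker, d)
    return best

# Hypothetical data: only units recognized in both patterns are compared.
references = {
    "alice": {"a": (0.0, 0.0), "c": (5.0, 5.0)},
    "bob": {"a": (3.0, 4.0), "b": (1.0, 1.0)},
}
utterance = {"a": (0.0, 0.0), "b": (1.0, 1.0)}
print(recognize_speaker(utterance, references))
```

Restricting the distance to shared units is what lets the comparison work when the input and the registered speech do not contain the same words; a real system would also need a decision threshold, which this sketch omits.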
    • 4. Granted invention patent
    • Confirmation notification by apparatus using audio recognition as to the acceptability of an input sound
    • US06338036B1
    • 2002-01-08
    • US09379358
    • 1999-08-23
    • Yasunaga Miyazawa
    • Yasunaga Miyazawa
    • G10L17/00
    • G10L15/22
    • When a sound to be recognized is input to a device, this invention briefly informs the user whether the sound has been input in an appropriate state. The following are provided: a sound inputting part which outputs, as digitized sound data, a sound to be recognized that is spoken by a user as a plurality of words forming one group; a sound analysis part which analyzes the sound data and calculates the sound power and characteristic data; a sound division detection/determination part which detects an effective sound division based on the sound power obtained in the sound analysis part and determines, based on the size of the sound power and the time length of the effective sound division, whether the sound to be recognized has been input in an appropriate state; a sound recognition processing part in which the sound to be recognized is recognized and processed; and an information outputting part which, immediately after the sound to be recognized has been input, outputs information showing that the recognition object sound is appropriate.
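The determination step in the abstract above (judging the input from the size of the sound power and the length of the effective sound division) can be sketched roughly as follows. This is an illustration under assumptions, not the patented implementation: the frame length, the thresholds, and the function names are all hypothetical.

```python
def frame_powers(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def effective_division(powers, power_threshold):
    """Return (start, end) frame indices of the longest run at or above the threshold."""
    best, start = None, None
    # A sentinel below any threshold closes a run that reaches the last frame.
    for i, p in enumerate(list(powers) + [float("-inf")]):
        if p >= power_threshold and start is None:
            start = i
        elif p < power_threshold and start is not None:
            if best is None or i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best

def input_is_appropriate(samples, frame_len=4, power_threshold=0.01,
                         max_power=0.9, min_frames=2, max_frames=50):
    """Assumed rule: the sound must be audible, not too loud, and of plausible length."""
    powers = frame_powers(samples, frame_len)
    division = effective_division(powers, power_threshold)
    if division is None:
        return False  # nothing above the power threshold was detected
    start, end = division
    peak = max(powers[start:end])
    return peak <= max_power and min_frames <= end - start <= max_frames

# Hypothetical input: silence, a moderate burst of sound, then silence.
samples = [0.0] * 4 + [0.5] * 8 + [0.0] * 4
print(input_is_appropriate(samples))
```

Because the check uses only power and duration, it can run before recognition finishes, which is consistent with the abstract's goal of notifying the user immediately after input.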
    • 5. Granted invention patent
    • Method and apparatus for speaker identification using mixture discriminant analysis to develop speaker models
    • US06233555B1
    • 2001-05-15
    • US09198579
    • 1998-11-24
    • Sarangarajan Parthasarathy; Aaron E. Rosenberg
    • Sarangarajan Parthasarathy; Aaron E. Rosenberg
    • G10L17/00
    • G10L17/04; G10L17/24
    • A speaker identification system is provided that constructs speaker models using a discriminant analysis technique where the data in each class is modeled by Gaussian mixtures. The speaker identification method and apparatus determine the identity of a speaker, as one of a small group, based on a sentence-length password utterance. A speaker's utterance is received and a sequence of a first set of feature vectors is computed based on the received utterance. The first set of feature vectors is then transformed into a second set of feature vectors using transformations specific to a particular segmentation unit, and likelihood scores of the second set of feature vectors are computed using speaker models trained using mixture discriminant analysis. The likelihood scores are then combined to determine an utterance score and the speaker's identity is validated based on the utterance score. The speaker identification method and apparatus also include training and enrollment phases. In the enrollment phase the speaker's password utterance is received multiple times. A transcription of the password utterance as a sequence of phones is obtained, and the phone string is stored in a database containing phone strings of other speakers in the group. In the training phase, the first set of feature vectors is extracted from each password utterance and the phone boundaries for each phone in the password transcription are obtained using a speaker independent phone recognizer. A mixture model is developed for each phone of a given speaker's password. Then, using the feature vectors from the password utterances of all of the speakers in the group, transformation parameters and transformed models are generated for each phone and speaker, using mixture discriminant analysis.
    • 6. Granted invention patent
    • Speech recognition of caller identifiers using location information
    • US06223156B1
    • 2001-04-24
    • US09056172
    • 1998-04-07
    • Randy G. Goldberg; Roy Philip Weber
    • Randy G. Goldberg; Roy Philip Weber
    • G10L17/00
    • H04M1/271; G10L15/24; H04M1/57
    • A speech recognition system recognizes a caller identifier received during a telephone call as a speech signal from a caller. The system generates a plurality of caller identifier choices from the speech signal and receives location information of the caller. The system includes a database on which is stored a plurality of caller identifiers indexed to a plurality of location information. The system queries a database based on the received location information and retrieves one or more caller identifiers from the database. The system then selects the recognized caller identifier from the plurality of caller identifier choices based on the retrieved one or more caller identifiers.
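The selection step in the abstract above can be sketched in a few lines of Python. The sketch assumes the recognizer returns its choices ordered best-first and that the database is a plain mapping from location information to a set of identifiers; both the data layout and the name `select_caller_identifier` are hypothetical.

```python
def select_caller_identifier(choices, location, database):
    """Pick the first recognizer choice that is registered for the caller's location.

    choices:  recognizer hypotheses, assumed to be ordered best-first
    location: location information received for the caller
    database: assumed mapping of location -> set of caller identifiers
    Returns None when no choice matches an identifier for that location.
    """
    candidates = database.get(location, set())
    for choice in choices:
        if choice in candidates:
            return choice
    return None

# Hypothetical data: the top hypothesis is a misrecognition that the
# location-indexed lookup filters out.
database = {"area-732": {"5551234", "5559876"}}
choices = ["5551284", "5551234"]
print(select_caller_identifier(choices, "area-732", database))
```

The location lookup narrows the set of plausible identifiers, so an acoustically similar but unregistered hypothesis is skipped in favor of one that the database confirms.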
    • 7. Granted invention patent
    • Speaker recognition over large population with fast and detailed matches
    • US06182037B2
    • 2001-01-30
    • US08851982
    • 1997-05-06
    • Stephane Herman Maes
    • Stephane Herman Maes
    • G10L17/00
    • G10L17/06
    • Fast and detailed match techniques for speaker recognition are combined into a hybrid system in which speakers are associated in groups when potential confusion is detected between a speaker being enrolled and a previously enrolled speaker. Thus the detailed match techniques are invoked only at the potential onset of saturation of the fast match technique while the detailed match is facilitated by limitation of comparisons to the group and the development of speaker-dependent models which principally function to distinguish between members of a group rather than to more fully characterize each speaker. Thus storage and computational requirements are limited and fast and accurate speaker recognition can be extended over populations of speakers which would degrade or saturate fast match systems and degrade performance of detailed match systems.
    • 10. Granted invention patent
    • Speech collating apparatus and speech collating method
    • US06718306B1
    • 2004-04-06
    • US09690669
    • 2000-10-17
    • Katsuhiko Satoh; Tsuneharu Takeda
    • Katsuhiko Satoh; Tsuneharu Takeda
    • G10L17/00
    • G10L17/08; G10L17/00; G10L21/06
    • A speech of a registered speaker input from an input unit is converted by a converting unit to a sound spectrogram “A” and stored. As a speech of a speaker to be identified is input from the input unit and converted to a sound spectrogram “B” by the converting unit, a detecting unit detects a partial image including a plurality of templates placed in the registered speech image A by a placing unit, and each of areas on the unknown speech image B in which maximum correlation coefficients are calculated. Then, a determining unit compares a mutual positional relationship of the plurality of templates with a mutual positional relationship of the respective areas in which the maximum correlation coefficients are detected to determine from the degree of difference therebetween the identity between the registered speech and the unknown speech.