    • 1. Invention application
    • SPEAKER AUTHENTICATION
    • WO2007098039A1
    • 2007-08-30
    • PCT/US2007/004137
    • 2007-02-13
    • MICROSOFT CORPORATION
    • ZHANG, Zhengyou; LIU, Ming
    • G10L17/00; G10L15/18; G10L15/14
    • G10L17/20; G10L17/08
    • Speaker authentication is performed by determining a similarity score for a test utterance and a stored training utterance. Computing the similarity score involves determining the sum of a group of functions, where each function includes the product of a posterior probability of a mixture component and a difference between an adapted mean and a background mean. The adapted mean is formed based on the background mean and the test utterance. The speech content provided by the speaker for authentication can be text-independent (i.e., any content they want to say) or text-dependent (i.e., a particular phrase used for training).
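
The scoring idea in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method claimed in the application: features are one-dimensional, the adapted mean uses a simple relevance-weighted (MAP-style) update, and the per-component terms of the test and training utterances are combined with a dot product. The abstract does not specify any of these choices, and all function and parameter names are hypothetical.

```python
import math


def gaussian(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def posteriors(x, weights, means, variances):
    """Posterior probability of each background mixture component given frame x."""
    likes = [w * gaussian(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(likes)
    return [like / total for like in likes]


def weighted_shifts(frames, weights, means, variances, relevance=1.0):
    """Per component: average posterior * (adapted mean - background mean).

    ASSUMPTION: the adapted mean is a relevance-weighted blend of the
    background mean and the posterior-weighted frame average.
    """
    k = len(means)
    occupancy = [0.0] * k   # sum of posteriors per component
    first_order = [0.0] * k  # sum of posterior * frame per component
    for x in frames:
        for i, p in enumerate(posteriors(x, weights, means, variances)):
            occupancy[i] += p
            first_order[i] += p * x
    shifts = []
    for i in range(k):
        adapted = (first_order[i] + relevance * means[i]) / (occupancy[i] + relevance)
        shifts.append((occupancy[i] / len(frames)) * (adapted - means[i]))
    return shifts


def similarity(test_frames, train_frames, weights, means, variances):
    """ASSUMPTION: combine the per-component terms of both utterances by dot product."""
    test = weighted_shifts(test_frames, weights, means, variances)
    train = weighted_shifts(train_frames, weights, means, variances)
    return sum(a * b for a, b in zip(test, train))


# Hypothetical two-component background model and toy utterances.
WEIGHTS, MEANS, VARIANCES = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
same_speaker = similarity([1.5, 1.4, -0.6], [1.6, -0.5, 1.3], WEIGHTS, MEANS, VARIANCES)
other_speaker = similarity([-1.5, -1.4, 0.6], [1.6, -0.5, 1.3], WEIGHTS, MEANS, VARIANCES)
```

In this toy setup, utterances that pull the component means in the same direction score positive, and utterances that pull them in opposite directions score negative.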
    • 2. Invention application
    • “Speaker recognition system”
    • WO2002103680A2
    • 2002-12-27
    • PCT/GB2002/002726
    • 2002-06-13
    • SECURIVOX LTD; SAPELUK, Andrew, Thomas
    • SAPELUK, Andrew, Thomas
    • G10L17/00
    • G10L17/02; G10L17/12; G10L17/20
    • Speaker recognition (identification and/or verification) methods and systems, in which speech models for enrolled speakers consist of sets of feature vectors representing the smoothed frequency spectrum of each of a plurality of frames and a clustering algorithm is applied to the feature vectors of the frames to obtain a reduced data set representing the original speech sample, and wherein the adjacent frames are overlapped by at least 80 %. Speech models of this type model the static components of the speech sample and exhibit temporal independence. An identifier strategy is employed in which modelling and classification processes are selected to give a false rejection rate substantially equal to zero. Each enrolled speaker is associated with a cohort of a predetermined number of other enrolled speakers and a test sample is always matched with either the claimed identity or one of its associated cohort. This makes the overall error rate of the system dependent only on the false acceptance rate, which is determined by the cohort size. The false error rate is further reduced by use of multiple parallel modelling and/or classification processes. Speech models are normalised prior to classification using a normalisation model derived from either the test speech sample or one of the enrolled speaker samples (most preferably from the claimed identity enrolment sample).
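
Two mechanical pieces of this abstract, the heavily overlapped framing and the cohort-based decision, might look like the sketch below. It assumes integer sample lists, an overlap given as a whole percentage, and precomputed model distances where smaller means closer. The names `split_frames` and `verify_with_cohort` are hypothetical and do not come from the application, and the spectral smoothing and clustering steps are not shown.

```python
def split_frames(samples, frame_len, overlap_percent=80):
    """Split samples into frames where neighbours share at least overlap_percent.

    The hop is rounded down, so the actual overlap is never below the request.
    """
    if not 0 <= overlap_percent < 100:
        raise ValueError("overlap_percent must be in [0, 100)")
    hop = max(1, frame_len * (100 - overlap_percent) // 100)
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def verify_with_cohort(distances, claimed):
    """Accept only if the claimed identity is the closest model.

    distances maps the claimed speaker and each cohort member to a
    model distance (ASSUMPTION: smaller means a better match).
    """
    best = min(distances, key=distances.get)
    return best == claimed


frames = split_frames(list(range(20)), frame_len=10)
accepted = verify_with_cohort({"alice": 0.3, "bob": 0.7, "carol": 0.9}, "alice")
rejected = verify_with_cohort({"alice": 0.3, "bob": 0.7, "carol": 0.9}, "bob")
```

Because the test sample is always assigned to the claimed identity or to one cohort member, a larger cohort gives an impostor more chances to land on someone other than the claimed speaker, which is how the abstract ties the false acceptance rate to cohort size.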
    • 3. Invention application
    • CHANNEL ESTIMATION SYSTEM AND METHOD FOR USE IN AUTOMATIC SPEAKER VERIFICATION SYSTEMS
    • WO99059136A1
    • 1999-11-18
    • PCT/US1999/010038
    • 1999-05-07
    • G10L15/06; G10L17/04; G10L17/20; G10L21/02; G10L5/06; G10L7/08
    • G10L17/20; G10L15/063; G10L17/04; G10L21/02
    • The voice print system of the present invention concerns an automatic speaker verification (ASV) system that is subword-based and text-dependent with no constraints on the choice of vocabulary words or language. One component of the preferred ASV system is a channel estimation and normalization component that is able to remove the characteristics of the test channel component (150) and/or enrollment channel component (90) to increase accuracy. The preferred methods and systems of the present invention termed Curve-Fitting (62, 64, 66) and Clean Speech (82, 86, 88, 90, 92), separately, together, and in combination with Pole filtering (42, 44, 46), significantly improve the existing methods of channel estimation and normalization. Unlike Cepstral Mean Subtraction, both Curve-Fitting (62, 64, 66) and Clean Speech (42, 44, 46) methods and systems extract only the channel related information from the cepstral mean and not any speech information.
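
For orientation, the baseline this abstract contrasts itself with, Cepstral Mean Subtraction, can be sketched as below. The abstract does not describe the Curve-Fitting and Clean Speech methods in enough detail to reproduce, so they are deliberately not attempted. The function name and the toy cepstral vectors are assumptions for illustration.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames from every frame.

    A fixed channel adds a constant offset in the cepstral domain, so removing
    the mean removes that offset. It also removes the speech-dependent part of
    the mean, which is the drawback the abstract points at.
    """
    count = len(frames)
    dims = len(frames[0])
    mean = [sum(frame[d] for frame in frames) / count for d in range(dims)]
    normalized = [[frame[d] - mean[d] for d in range(dims)] for frame in frames]
    return normalized, mean


# Toy cepstra, and the same cepstra seen through a channel with a constant offset.
clean = [[1.0, 2.0], [3.0, 6.0]]
through_channel = [[1.5, 1.0], [3.5, 5.0]]  # clean + [0.5, -1.0]

clean_norm, clean_mean = cepstral_mean_subtraction(clean)
channel_norm, channel_mean = cepstral_mean_subtraction(through_channel)
```

Both inputs normalize to the same vectors, which shows the channel invariance. The estimated mean mixes channel and speech information, which is why the abstract prefers methods that extract only the channel-related part.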
    • 5. Invention application
    • DYNAMIC THRESHOLD FOR SPEAKER VERIFICATION
    • WO2015199813A1
    • 2015-12-30
    • PCT/US2015/028859
    • 2015-05-01
    • GOOGLE INC.
    • FOERSTER, Jakob Nicolaus; CASADO, Diego Melendo
    • G10L17/08; G10L17/12; G10L17/22; H04M3/38
    • G10L17/20; G06F3/167; G10L17/005; G10L17/02; G10L17/04; G10L17/06; G10L17/08; G10L17/12; G10L17/22; G10L17/24; G10L25/84; H04M3/385
    • Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a dynamic threshold for speaker verification are disclosed. In one aspect, a method includes the actions of receiving, for each of multiple utterances of a hotword, a data set including at least a speaker verification confidence score, and environmental context data. The actions further include selecting from among the data sets, a subset of the data sets that are associated with a particular environmental context. The actions further include selecting a particular data set from among the subset of data sets based on one or more selection criteria. The actions further include selecting, as a speaker verification threshold for the particular environmental context, the speaker verification confidence score. The actions further include providing the speaker verification threshold for use in performing speaker verification of utterances that are associated with the particular environmental context.
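
The sequence of actions in this abstract (collect scored utterances, filter by environmental context, pick one data set, use its score as the threshold) could be sketched as below. The selection criterion is an assumption: the code picks the record at a target rejection fraction within the context, whereas the abstract only says "one or more selection criteria". The record type, the context labels and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UtteranceRecord:
    confidence: float  # speaker verification confidence score
    context: str       # environmental context label (hypothetical, e.g. "quiet")


def threshold_for_context(records, context, reject_fraction=0.2):
    """Pick a per-context threshold from logged hotword utterances.

    ASSUMPTION: the selected data set is the one whose score sits at
    reject_fraction of the way up the sorted scores for that context.
    """
    subset = sorted(
        (r for r in records if r.context == context),
        key=lambda r: r.confidence,
    )
    if not subset:
        raise ValueError(f"no utterances recorded for context {context!r}")
    index = min(int(len(subset) * reject_fraction), len(subset) - 1)
    return subset[index].confidence


records = [
    UtteranceRecord(0.90, "quiet"), UtteranceRecord(0.70, "quiet"),
    UtteranceRecord(0.80, "quiet"), UtteranceRecord(0.95, "quiet"),
    UtteranceRecord(0.85, "quiet"), UtteranceRecord(0.40, "noisy"),
    UtteranceRecord(0.60, "noisy"), UtteranceRecord(0.50, "noisy"),
]
quiet_threshold = threshold_for_context(records, "quiet")
noisy_threshold = threshold_for_context(records, "noisy")
```

With this toy data the noisy context ends up with a lower threshold than the quiet one, which matches the point of a dynamic threshold: one fixed cut-off would reject too many genuine utterances in noise.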
    • 8. Invention application
    • SPEAKER VERIFICATION
    • WO2010049695A1
    • 2010-05-06
    • PCT/GB2009/002579
    • 2009-10-29
    • BRITISH TELECOMMUNICATIONS PUBLIC LIMITED COMPANY; ARIYAEENINIA, Aladdin, Mohammad; PILLAY, Surosh, Govinasamy; PAWLEWSKI, Mark
    • ARIYAEENINIA, Aladdin, Mohammad; PILLAY, Surosh, Govinasamy; PAWLEWSKI, Mark
    • G10L17/00
    • G10L17/12; G10L17/20
    • A speaker verification method is proposed that first builds a general model of user utterances using a set of general training speech data. The user also trains the system by providing a training utterance, such as a passphrase or other spoken utterance. Then in a test phase, the user provides a test utterance which includes some background noise as well as a test voice sample. The background noise is used to bring the condition of the training data closer to that of the test voice sample by modifying the training data and a reduced set of the general data, before creating adapted training and general models. Match scores are generated based on the comparison between the adapted models and the test voice sample, with a final match score calculated based on the difference between the match scores. This final match score gives a measure of the degree of matching between the test voice sample and the training utterance and is based on the degree of matching between the speech characteristics from extracted feature vectors that make up the respective speech signals, and is not a direct comparison of the raw signals themselves. Thus, the method can be used to verify a speaker without necessarily requiring the speaker to provide an identical test phrase to the phrase provided in the training sample.
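
The final scoring step described here, a difference between the match score against the adapted training model and the match score against the adapted general model, can be illustrated with the sketch below. It assumes each model is a single 1-D Gaussian `(mean, variance)` and each match score is an average log-likelihood. The noise-based adaptation of the training and general data is not shown, and all names are hypothetical.

```python
import math


def avg_log_likelihood(frames, mean, var):
    """Average log-likelihood of 1-D feature frames under a Gaussian model."""
    total = sum(
        -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)
        for x in frames
    )
    return total / len(frames)


def final_match_score(frames, speaker_model, general_model):
    """Difference of the two match scores; positive favours the claimed speaker.

    ASSUMPTION: both models are already adapted to the test noise condition.
    """
    speaker_score = avg_log_likelihood(frames, *speaker_model)
    general_score = avg_log_likelihood(frames, *general_model)
    return speaker_score - general_score


speaker_model = (2.0, 1.0)   # hypothetical (mean, variance) of the adapted training model
general_model = (0.0, 4.0)   # hypothetical (mean, variance) of the adapted general model

genuine = final_match_score([1.9, 2.1, 2.0], speaker_model, general_model)
impostor = final_match_score([-2.0, 0.0, -3.0], speaker_model, general_model)
```

Because the comparison runs on feature statistics rather than raw waveforms, the test phrase does not need to match the training phrase, which is the property the abstract highlights.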
    • 9. Invention application
    • METHOD AND SYSTEM FOR ESTABLISHING HANDSET-DEPENDENT NORMALIZING MODELS FOR SPEAKER RECOGNITION
    • WO98038632A1
    • 1998-09-03
    • PCT/US1998/003750
    • 1998-02-24
    • G10L15/20; G10L17/00; G10L9/06
    • G10L17/20; G10L15/20; G10L17/00
    • A method and apparatus is provided for establishing a normalizing model suitable for use with a speaker model to normalize the speaker model, the speaker model for modelling voice characteristics of a specific individual, the speaker model and the normalizing model for use in recognizing identity of a speaker. A normalizer module (231) within a scoring module (215) uses the normalizing score (229) to normalize the speaker score (225) thereby obtaining a normalized speaker score (217). Based on the normalized speaker score (217), a decision module (219) makes a decision (221) of whether to believe that the test speaker (203), whose utterance was the source of the speech data (213), is the reference speaker (403).
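
The normalizer and decision modules named in this abstract can be sketched in a few lines. The code assumes log-domain scores, normalization by subtraction, and one normalizing score per handset type. The handset labels, the zero threshold and the function name are hypothetical, and the abstract does not say how the normalizing score is applied.

```python
def normalized_decision(speaker_score, normalizing_scores, handset, threshold=0.0):
    """Normalize a speaker score with a handset-dependent normalizing score.

    ASSUMPTION: scores are log-likelihoods, so normalization is a subtraction.
    Returns (normalized score, accept/reject decision).
    """
    normalized = speaker_score - normalizing_scores[handset]
    return normalized, normalized > threshold


# Hypothetical normalizing-model scores for the same test utterance,
# one per handset type.
normalizing_scores = {"carbon": -45.0, "electret": -40.0}

carbon_result = normalized_decision(-41.0, normalizing_scores, "carbon")
electret_result = normalized_decision(-41.0, normalizing_scores, "electret")
```

The same raw speaker score leads to opposite decisions depending on which handset's normalizing model is used, which is the motivation for making the normalizing model handset-dependent.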