    • 81. Invention application
    • DIGITAL VOICE SIGNATURE OF TRANSACTIONS
    • US20150187359A1
    • 2015-07-02
    • US14644129
    • 2015-03-10
    • Ack3 Bionetics Pte Limited
    • Sajit Bhaskaran
    • G10L17/24; G10L15/26
    • G10L17/24; G07C9/00071; G10L15/26; G10L15/265; G10L17/04; G10L17/08
    • A method that includes receiving, by a server, an access request sent to a network address of a resource server from a user using a user device, the access request comprising a unique record identifier is provided. The method includes placing a call to the user device, receiving from the user a voice response to a prompt associated with an implied security question for the user, comparing the voice response of the user with a selected voice biometrics record, converting the voice response into a speech-to-text phrase, and comparing the speech-to-text phrase against a stored secret text phrase to verify that the speech-to-text phrase matches an answer to the silent security question. A method for signing a transaction, including collecting a plurality of voice samples from a user during a transaction and concatenating the plurality of voice samples into a single sound file is also provided.
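The last sentence of the abstract above describes concatenating several voice samples into a single sound file. The abstract does not give the audio format or the implementation, so the following is only a minimal sketch, assuming the samples are uncompressed WAV data that all share one format. The names `make_wav` and `concatenate_wav` are hypothetical and do not come from the patent.

```python
import io
import wave


def make_wav(frames, channels=1, sample_width=2, rate=8000):
    """Wrap raw PCM frames in a WAV container (hypothetical helper for the demo)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(channels)
        writer.setsampwidth(sample_width)
        writer.setframerate(rate)
        writer.writeframes(frames)
    return buffer.getvalue()


def concatenate_wav(samples):
    """Join several WAV byte strings into one WAV byte string.

    Assumption: every sample uses the same channel count, sample width,
    and sample rate; otherwise a ValueError is raised.
    """
    if not samples:
        raise ValueError("at least one voice sample is required")

    expected_format = None
    chunks = []
    # First pass: validate the formats and collect the raw frames.
    for data in samples:
        with wave.open(io.BytesIO(data), "rb") as reader:
            fmt = (reader.getnchannels(), reader.getsampwidth(), reader.getframerate())
            if expected_format is None:
                expected_format = fmt
            elif fmt != expected_format:
                raise ValueError("all samples must share one audio format")
            chunks.append(reader.readframes(reader.getnframes()))

    # Second pass: write all frames back to back into a single WAV file.
    output = io.BytesIO()
    with wave.open(output, "wb") as writer:
        writer.setnchannels(expected_format[0])
        writer.setsampwidth(expected_format[1])
        writer.setframerate(expected_format[2])
        writer.writeframes(b"".join(chunks))
    return output.getvalue()


# Usage: two short samples collected during one transaction.
first = make_wav(b"\x00\x01" * 100)
second = make_wav(b"\x02\x03" * 50)
signature_file = concatenate_wav([first, second])
```

Validating all samples before writing keeps the output from being half-written when one sample has a different format.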
    • 83. Invention application
    • User Programmable Voice Command Recognition Based On Sparse Features
    • US20150073795A1
    • 2015-03-12
    • US14458688
    • 2014-08-13
    • Texas Instruments Incorporated
    • Bozhao Tan
    • G10L15/06
    • G10L17/04; G10L15/02; G10L15/063; G10L17/02; G10L17/08; G10L17/22; G10L25/09; G10L25/18
    • A low power sound recognition sensor is configured to receive an analog signal that may contain a signature sound. Sparse sound parameter information is extracted from the analog signal. The extracted sparse sound parameter information is processed using a speaker dependent sound signature database stored in the sound recognition sensor to identify sounds or speech contained in the analog signal. The sound signature database may include several user enrollments for a sound command each representing an entire word or multiword phrase. The extracted sparse sound parameter information may be compared to the multiple user enrolled signatures using cosine distance, Euclidean distance, correlation distance, etc., for example.
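The abstract above mentions comparing extracted sparse features against several user-enrolled signatures using, for example, cosine distance. As a rough illustration only, the sketch below picks the command whose closest enrollment is nearest to the incoming feature vector. The function names, the feature vectors, and the acceptance threshold of 0.2 are all assumptions made for this example, not details from the patent.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an all-zero vector as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def match_command(features, enrollments, threshold=0.2):
    """Find the enrolled command closest to `features`.

    `enrollments` maps a command name to a list of enrolled signature
    vectors (several enrollments per command). Returns (command, distance),
    with command set to None when no enrollment is within `threshold`.
    """
    best_command, best_distance = None, float("inf")
    for command, signatures in enrollments.items():
        for signature in signatures:
            distance = cosine_distance(features, signature)
            if distance < best_distance:
                best_command, best_distance = command, distance
    if best_distance <= threshold:
        return best_command, best_distance
    return None, best_distance


# Usage with made-up three-dimensional feature vectors.
enrollments = {
    "lights on": [[1.0, 0.0, 2.0], [0.9, 0.1, 2.1]],
    "lights off": [[0.0, 3.0, 0.5]],
}
command, distance = match_command([1.0, 0.05, 2.0], enrollments)
```

Swapping `cosine_distance` for a Euclidean or correlation distance would leave `match_command` unchanged apart from the threshold scale.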
    • 84. Invention application
    • Method for Segmenting Videos and Audios into Clips Using Speaker Recognition
    • US20150051912A1
    • 2015-02-19
    • US14456725
    • 2014-08-11
    • Chunghwa Telecom Co., Ltd.
    • Chun-Lin WANG; Chi-Shi LIU; Chih-Jung LIN
    • G10L17/00; G11B27/30
    • G10L17/00; G10L17/04; G10L17/08
    • A method for segmenting video and audio into clips using speaker recognition is provided to segment audio according to speaker audio, and to make audio clips correspond to the audio and video signals to generate audio and video clips. The method instantly trains an independent speaker model by increasing an unknown speaker source audio signal, and the speaker recognition result is applied to determine the audio and video clips. Independent speaker clips of source audio are determined according to the speaker model and the speaker model is renewed according to the independent speaker clips of source audio. This method segments audio by the speaker model without waiting for complete speaker feature audio signals to be collected. The method is also able to segment the audio and video into clips based on the recognition result of speaker audio, and can be used to segment TV audio and video into clips.
    • 85. Invention application
    • SYSTEM AND METHOD FOR DETECTING SYNTHETIC SPEAKER VERIFICATION
    • US20140350938A1
    • 2014-11-27
    • US14454104
    • 2014-08-07
    • AT&T Intellectual Property I, L.P.
    • Horst J. Schroeter
    • G10L17/20; G10L17/24; G10L17/00
    • G10L17/24; G10L17/005; G10L17/04; G10L17/08; G10L17/20
    • Disclosed herein are systems, methods, and tangible computer-readable media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
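The variance check in the abstract above can be sketched as follows. Natural speech varies between repetitions, so samples that are identical or nearly identical suggest a replayed or synthesized voice. This is only an illustration under stated assumptions: each speech sample is already reduced to a numeric feature vector, variation is measured as the mean pairwise Euclidean distance, and the threshold `min_variation` is an arbitrary value. None of these choices, nor the function names, come from the patent.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def check_samples(samples, min_variation=0.05):
    """Decide whether repeated utterances vary enough to look natural.

    Returns "need_more_samples" when fewer than two samples are given,
    "deny" when the samples show too little variation, and "verify"
    otherwise. The speaker-matching step itself is out of scope here.
    """
    if len(samples) < 2:
        return "need_more_samples"
    # Compare each sample with every other sample.
    distances = [euclidean(a, b) for a, b in combinations(samples, 2)]
    mean_distance = sum(distances) / len(distances)
    if mean_distance < min_variation:
        return "deny"
    return "verify"


# Usage: identical samples are rejected, naturally varying ones pass.
replayed = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
natural = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.3]]
results = (check_samples(replayed), check_samples(natural))
```

Raising `min_variation` makes the check stricter, which loosely corresponds to the embodiment that adjusts the threshold to the required authentication certainty.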
    • 86. Invention application
    • METHOD AND DEVICE FOR VOICEPRINT RECOGNITION
    • US20140214417A1
    • 2014-07-31
    • US14105110
    • 2013-12-12
    • Tencent Technology (Shenzhen) Company Limited
    • Eryu WANG; Li LU; Xiang ZHANG; Haibo LIU; Lou LI; Feng RAO; Duling LU; Shuai YUE; Bo CHEN
    • G10L17/00
    • G10L17/18; G10L17/02; G10L17/04; G10L17/08
    • A method and device for voiceprint recognition include: establishing a first-level Deep Neural Network (DNN) model based on unlabeled speech data, the unlabeled speech data containing no speaker labels and the first-level DNN model specifying a plurality of basic voiceprint features for the unlabeled speech data; obtaining a plurality of high-level voiceprint features by tuning the first-level DNN model based on labeled speech data, the labeled speech data containing speech samples with respective speaker labels, and the tuning producing a second-level DNN model specifying the plurality of high-level voiceprint features; based on the second-level DNN model, registering a respective high-level voiceprint feature sequence for a user based on a registration speech sample received from the user; and performing speaker verification for the user based on the respective high-level voiceprint feature sequence registered for the user.
    • 88. Invention application
    • CHANNEL DETECTION IN NOISE USING SINGLE CHANNEL DATA
    • US20130275128A1
    • 2013-10-17
    • US13804947
    • 2013-03-14
    • Heiko Claussen; Justinian Rosca
    • Heiko Claussen; Justinian Rosca
    • G10L15/20
    • G10L15/20; G10L17/08; G10L25/27; G10L25/78
    • Methods related to Generalized Mutual Interdependence Analysis (GMIA), a low complexity statistical method for projecting data in a subspace that captures invariant properties of the data, are implemented on a processor based system. GMIA methods are applied to the signal processing problem of voice activity detection and classification. Real-world conversational speech data are modeled to fit the GMIA assumptions. Low complexity GMIA computations extract reliable features for classification of sound under noisy conditions and operate with small amounts of data. A speaker is characterized by a slow varying or invariant channel that is learned and is tracked from single channel data by GMIA methods.
    • 90. Granted invention patent
    • Speaker selection based at least on an acoustic feature value similar to that of an utterance speaker
    • US08452596B2
    • 2013-05-28
    • US12593414
    • 2008-02-29
    • Masahiro Tani; Tadashi Emori; Yoshifumi Onishi
    • Masahiro Tani; Tadashi Emori; Yoshifumi Onishi
    • G10L17/00
    • G10L17/08
    • To enable selection of a speaker, the acoustic feature value of which is similar to that of an utterance speaker, with accuracy and stability, while adapting to changes even when the acoustic feature value of the speaker changes every moment, a long-time speaker score is calculated (log likelihood of each of a plurality of speaker models stored in a speaker model storage with respect to the acoustic feature value) based on an arbitrary number of utterances, for example, and a short-time speaker score is calculated based on a short-time utterance, for example. Speakers are selected corresponding to a predetermined number of speaker models having a high long-time speaker score. Speakers are selected corresponding to the speaker models, the number of which is smaller than the predetermined number and the short-time speaker score of which is high, from among the speakers having a high long-time speaker score.
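The two-stage selection in the abstract above can be illustrated with a short sketch: first keep a predetermined number of speakers with the highest long-time scores, then keep a smaller number of those with the highest short-time scores. The scores here are assumed to be precomputed log-likelihoods stored in dictionaries. The function name, parameter names, and values are hypothetical, and how the scores are computed from speaker models is not shown.

```python
def select_speakers(long_scores, short_scores, n_long, n_short):
    """Two-stage speaker selection.

    Stage 1 keeps the `n_long` speakers with the highest long-time score.
    Stage 2 keeps, among those, the `n_short` speakers with the highest
    short-time score. `n_short` must be smaller than `n_long`.
    """
    if not 0 < n_short < n_long:
        raise ValueError("n_short must be positive and smaller than n_long")
    # Stage 1: stable candidates based on many utterances.
    candidates = sorted(long_scores, key=long_scores.get, reverse=True)[:n_long]
    # Stage 2: adapt to the current utterance within the stable candidates.
    ranked = sorted(candidates, key=lambda name: short_scores[name], reverse=True)
    return ranked[:n_short]


# Usage with made-up log-likelihood scores (higher is better).
long_scores = {"a": -10.0, "b": -12.0, "c": -30.0, "d": -11.0}
short_scores = {"a": -5.0, "b": -2.0, "c": -1.0, "d": -9.0}
selected = select_speakers(long_scores, short_scores, n_long=3, n_short=2)
```

In this example speaker "c" has the best short-time score but is filtered out in the first stage, which is the stabilizing effect the long-time score is meant to provide.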