Quick Patent Search - Fast search of global patents, free commercial patent database - IPRDB

81. Invention application

US20150187359A1 DIGITAL VOICE SIGNATURE OF TRANSACTIONS (In force)
Publication No.: US20150187359A1
Publication date: 2015-07-02
Application No.: US14644129
Filing date: 2015-03-10
Applicant: Ack3 Bionetics Pte Limited
Inventor: Sajit Bhaskaran
IPC classification: G10L17/24, G10L15/26
CPC classification: G10L17/24, G07C9/00071, G10L15/26, G10L15/265, G10L17/04, G10L17/08
Abstract: A method that includes receiving, by a server, an access request sent to a network address of a resource server from a user using a user device, the access request comprising a unique record identifier is provided. The method includes placing a call to the user device, receiving from the user a voice response to a prompt associated with an implied security question for the user, comparing the voice response of the user with a selected voice biometrics record, converting the voice response into a speech-to-text phrase, and comparing the speech-to-text phrase against a stored secret text phrase to verify that the speech-to-text phrase matches an answer to the silent security question. A method for signing a transaction, including collecting a plurality of voice samples from a user during a transaction and concatenating the plurality of voice samples into a single sound file is also provided.
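The signing step in the abstract above (collecting several voice samples and concatenating them into a single sound file) could look roughly like the sketch below. The abstract does not name a file format, so the use of WAV and the helper names `make_wav` and `concatenate_wavs` are assumptions made purely for illustration.

```python
import io
import wave


def make_wav(frames: bytes, rate: int = 8000) -> bytes:
    """Build an in-memory mono 16-bit WAV file from raw PCM frames (hypothetical helper)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(frames)
    return buf.getvalue()


def concatenate_wavs(samples: list[bytes]) -> bytes:
    """Join WAV files that share one audio format into a single WAV file."""
    if not samples:
        raise ValueError("at least one voice sample is required")
    out = io.BytesIO()
    params = None
    with wave.open(out, "wb") as writer:
        for data in samples:
            with wave.open(io.BytesIO(data), "rb") as reader:
                fmt = (reader.getnchannels(), reader.getsampwidth(), reader.getframerate())
                if params is None:
                    # The first sample fixes the output format.
                    params = fmt
                    writer.setnchannels(fmt[0])
                    writer.setsampwidth(fmt[1])
                    writer.setframerate(fmt[2])
                elif fmt != params:
                    raise ValueError("all samples must share channels, sample width and rate")
                writer.writeframes(reader.readframes(reader.getnframes()))
    return out.getvalue()
```

Samples are appended in the order they were collected, so the resulting file preserves the sequence of prompts answered during the transaction.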

82. Invention application

US20150088498A1 LOW LATENCY REAL-TIME VOCAL TRACT LENGTH NORMALIZATION (In force)
Publication No.: US20150088498A1
Publication date: 2015-03-26
Application No.: US14554339
Filing date: 2014-11-26
Applicant: AT&T INTELLECTUAL PROPERTY II, L.P.
Inventors: Vincent GOFFIN, Andrej LJOLJE, Murat Saraclar
IPC classification: G10L15/06
CPC classification: G10L15/063, G10L15/10, G10L15/12, G10L17/04, G10L17/08
Abstract: A method and system for training an automatic speech recognition system are provided. The method includes separating training data into speaker specific segments, and for each speaker specific segment, performing the following acts: generating spectral data, selecting a first warping factor and warping the spectral data, and comparing the warped spectral data with a speech model. The method also includes iteratively performing the steps of selecting another warping factor and generating another warped spectral data, comparing the other warped spectral data with the speech model, and if the other warping factor produces a closer match to the speech model, saving the other warping factor as the best warping factor for the speaker specific segment. The system includes modules configured to control a processor in the system to perform the steps of the method.
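The iterative search for a best warping factor described in the abstract above can be illustrated with a small sketch. It assumes that the spectral data for a segment is a single list of magnitudes, that warping is a linear rescaling of the frequency axis, and that "closer match" means a smaller squared distance to a model spectrum. None of these choices, nor the function names, come from the patent.

```python
def warp_spectrum(spectrum, alpha):
    """Linearly warp the frequency axis: output bin k reads input position k * alpha."""
    n = len(spectrum)
    warped = []
    for k in range(n):
        pos = min(k * alpha, n - 1)  # clamp to the last bin
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        # Linear interpolation between the two neighbouring bins.
        warped.append((1 - frac) * spectrum[lo] + frac * spectrum[hi])
    return warped


def squared_distance(a, b):
    """Sum of squared differences between two equal-length spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_warping_factor(spectrum, model, factors):
    """Try each candidate factor and keep the one whose warped spectrum is closest to the model."""
    best_alpha, best_dist = None, float("inf")
    for alpha in factors:
        dist = squared_distance(warp_spectrum(spectrum, alpha), model)
        if dist < best_dist:  # a closer match replaces the saved best factor
            best_alpha, best_dist = alpha, dist
    return best_alpha
```

In practice the candidate list would be a small grid around 1.0, and the search would be run once per speaker-specific segment.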

83. Invention application

US20150073795A1 User Programmable Voice Command Recognition Based On Sparse Features (In force)
Publication No.: US20150073795A1
Publication date: 2015-03-12
Application No.: US14458688
Filing date: 2014-08-13
Applicant: Texas Instruments Incorporated
Inventor: Bozhao Tan
IPC classification: G10L15/06
CPC classification: G10L17/04, G10L15/02, G10L15/063, G10L17/02, G10L17/08, G10L17/22, G10L25/09, G10L25/18
Abstract: A low power sound recognition sensor is configured to receive an analog signal that may contain a signature sound. Sparse sound parameter information is extracted from the analog signal. The extracted sparse sound parameter information is processed using a speaker dependent sound signature database stored in the sound recognition sensor to identify sounds or speech contained in the analog signal. The sound signature database may include several user enrollments for a sound command each representing an entire word or multiword phrase. The extracted sparse sound parameter information may be compared to the multiple user enrolled signatures using cosine distance, Euclidean distance, correlation distance, etc., for example.
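The comparison of extracted features against several enrolled signatures per command, using the distance measures named in the abstract above, might be sketched as follows. The feature vectors, the dictionary layout of the signature database, the acceptance threshold, and the function names are all assumptions for illustration.

```python
import math


def cosine_distance(a, b):
    """1 minus the cosine similarity; returns 1.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correlation_distance(a, b):
    """Cosine distance computed after removing each vector's mean."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return cosine_distance([x - ma for x in a], [y - mb for y in b])


def recognize(features, enrollments, metric=cosine_distance, threshold=0.2):
    """Return the command whose nearest enrolled signature is within the threshold, else None.

    enrollments maps a command name to a list of enrolled signature vectors.
    """
    best_cmd, best_dist = None, float("inf")
    for command, signatures in enrollments.items():
        for signature in signatures:
            dist = metric(features, signature)
            if dist < best_dist:
                best_cmd, best_dist = command, dist
    return best_cmd if best_dist <= threshold else None
```

Keeping several enrollments per command lets the nearest-signature search tolerate natural variation in how the user says the same phrase.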

84. Invention application

US20150051912A1 Method for Segmenting Videos and Audios into Clips Using Speaker Recognition (Under examination, published)
Publication No.: US20150051912A1
Publication date: 2015-02-19
Application No.: US14456725
Filing date: 2014-08-11
Applicant: Chunghwa Telecom Co., Ltd.
Inventors: Chun-Lin WANG, Chi-Shi LIU, Chih-Jung LIN
IPC classification: G10L17/00, G11B27/30
CPC classification: G10L17/00, G10L17/04, G10L17/08
Abstract: A method for segmenting video and audio into clips using speaker recognition is provided to segment audio according to speaker audio, and to make audio clips correspond to the audio and video signals to generate audio and video clips. The method instantly trains an independent speaker model by increasing an unknown speaker source audio signal, and the speaker recognition result is applied to determine the audio and video clips. Independent speaker clips of source audio are determined according to the speaker model and the speaker model is renewed according to the independent speaker clips of source audio. This method segments audio by the speaker model without waiting for complete speaker feature audio signals to be collected. The method is also able to segment the audio and video into clips based on the recognition result of speaker audio, and can be used to segment TV audio and video into clips.

85. Invention application

US20140350938A1 SYSTEM AND METHOD FOR DETECTING SYNTHETIC SPEAKER VERIFICATION (In force)
Publication No.: US20140350938A1
Publication date: 2014-11-27
Application No.: US14454104
Filing date: 2014-08-07
Applicant: AT&T Intellectual Property I, L.P.
Inventor: Horst J. Schroeter
IPC classification: G10L17/20, G10L17/24, G10L17/00
CPC classification: G10L17/24, G10L17/005, G10L17/04, G10L17/08, G10L17/20
Abstract: Disclosed herein are systems, methods, and tangible computer-readable media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
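The variance test in the abstract above (deny when repeated samples of the same phrase are identical or nearly so, accept when they vary enough) can be sketched as below. Representing each sample as a feature vector, measuring variation as the mean pairwise Euclidean distance, and the default threshold are assumptions, not details from the patent. The sketch covers only the variance check and not the speaker matching itself.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def passes_variance_check(samples, min_variation=0.05):
    """Return True if the samples vary enough to look like live speech.

    samples: feature vectors for repeated utterances of the same word or phrase.
    min_variation: assumed threshold on the mean pairwise distance; raising it
    corresponds to demanding more authentication certainty.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to compare")
    distances = [euclidean(a, b) for a, b in combinations(samples, 2)]
    if min(distances) == 0.0:
        return False  # two samples are identical: likely replayed or synthesized
    return sum(distances) / len(distances) >= min_variation
```

A caller that finds the result too close to the threshold could request further samples and rerun the check, in line with the embodiment that handles an inconclusive initial comparison.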

86. Invention application

US20140214417A1 METHOD AND DEVICE FOR VOICEPRINT RECOGNITION (In force)
Publication No.: US20140214417A1
Publication date: 2014-07-31
Application No.: US14105110
Filing date: 2013-12-12
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventors: Eryu WANG, Li LU, Xiang ZHANG, Haibo LIU, Lou LI, Feng RAO, Duling LU, Shuai YUE, Bo CHEN
IPC classification: G10L17/00
CPC classification: G10L17/18, G10L17/02, G10L17/04, G10L17/08
Abstract: A method and device for voiceprint recognition include: establishing a first-level Deep Neural Network (DNN) model based on unlabeled speech data, the unlabeled speech data containing no speaker labels and the first-level DNN model specifying a plurality of basic voiceprint features for the unlabeled speech data; obtaining a plurality of high-level voiceprint features by tuning the first-level DNN model based on labeled speech data, the labeled speech data containing speech samples with respective speaker labels, and the tuning producing a second-level DNN model specifying the plurality of high-level voiceprint features; based on the second-level DNN model, registering a respective high-level voiceprint feature sequence for a user based on a registration speech sample received from the user; and performing speaker verification for the user based on the respective high-level voiceprint feature sequence registered for the user.

87. Invention application

US20140114660A1 Method and Device for Speaker Recognition (In force)
Publication No.: US20140114660A1
Publication date: 2014-04-24
Application No.: US14145318
Filing date: 2013-12-31
Applicant: Huawei Technologies Co., Ltd.
Inventors: Xiang Zhang, Hualin Wan, Jun Zhang
IPC classification: G10L15/14, G10L17/00
CPC classification: G10L15/14, G10L17/00, G10L17/04, G10L17/06, G10L17/08, G10L17/10
Abstract: A method and device for speaker recognition are provided. In the present invention, identifiability re-estimation is performed on a first vector (namely, a weight vector) in a score function by adopting a support vector machine (SVM), so that a recognition result of a characteristic parameter of a test voice is more accurate, thereby improving identifiability of speaker recognition.

88. Invention application

US20130275128A1 CHANNEL DETECTION IN NOISE USING SINGLE CHANNEL DATA (In force)
Publication No.: US20130275128A1
Publication date: 2013-10-17
Application No.: US13804947
Filing date: 2013-03-14
Applicants: Heiko Claussen, Justinian Rosca
Inventors: Heiko Claussen, Justinian Rosca
IPC classification: G10L15/20
CPC classification: G10L15/20, G10L17/08, G10L25/27, G10L25/78
Abstract: Methods related to Generalized Mutual Interdependence Analysis (GMIA), a low complexity statistical method for projecting data in a subspace that captures invariant properties of the data, are implemented on a processor based system. GMIA methods are applied to the signal processing problem of voice activity detection and classification. Real-world conversational speech data are modeled to fit the GMIA assumptions. Low complexity GMIA computations extract reliable features for classification of sound under noisy conditions and operate with small amounts of data. A speaker is characterized by a slow varying or invariant channel that is learned and is tracked from single channel data by GMIA methods.

89. Invention grant

US08489397B2 Method and device for providing speech-to-text encoding and telephony service (In force)
Publication No.: US08489397B2
Publication date: 2013-07-16
Application No.: US13609918
Filing date: 2012-09-11
Applicants: Charles David Caldwell, John Bruce Harlow, Robert J. Sayko, Norman Shaye
Inventors: Charles David Caldwell, John Bruce Harlow, Robert J. Sayko, Norman Shaye
IPC classification: G10L15/00, G10L15/26, G10L15/06, G10L17/00
CPC classification: G10L15/26, G10L15/265, G10L17/08, G10L17/26, G10L19/0018, H04M1/2475, H04M1/2535, H04M1/57, H04M11/066, H04M2250/74, H04N21/4394, H04N21/440236, H04N21/4788
Abstract: A machine-readable medium and a network device are provided for speech-to-text translation. Speech packets are received at a broadband telephony interface and stored in a buffer. The speech packets are processed and textual representations thereof are displayed as words on a display device. Speech processing is activated and deactivated in response to a command from a subscriber.

90. Invention grant

US08452596B2 Speaker selection based at least on an acoustic feature value similar to that of an utterance speaker (In force)
Publication No.: US08452596B2
Publication date: 2013-05-28
Application No.: US12593414
Filing date: 2008-02-29
Applicants: Masahiro Tani, Tadashi Emori, Yoshifumi Onishi
Inventors: Masahiro Tani, Tadashi Emori, Yoshifumi Onishi
IPC classification: G10L17/00
CPC classification: G10L17/08
Abstract: To enable selection of a speaker, the acoustic feature value of which is similar to that of an utterance speaker, with accuracy and stability, while adapting to changes even when the acoustic feature value of the speaker changes every moment, a long-time speaker score is calculated (log likelihood of each of a plurality of speaker models stored in a speaker model storage with respect to the acoustic feature value) based on an arbitrary number of utterances, for example, and a short-time speaker score is calculated based on a short-time utterance, for example. Speakers are selected corresponding to a predetermined number of speaker models having a high long-time speaker score. Speakers are selected corresponding to the speaker models, the number of which is smaller than the predetermined number and the short-time speaker score of which is high, from among the speakers having a high long-time speaker score.
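The two-stage selection in the abstract above (first keep a predetermined number of speakers with high long-time scores, then keep a smaller number of those with high short-time scores) could be sketched as follows. Holding the scores in dictionaries keyed by speaker ID, and the function and parameter names, are assumptions for illustration. How the log-likelihood scores are computed is outside the sketch.

```python
def select_speakers(long_scores, short_scores, n_long, n_short):
    """Pick speakers by long-time score first, then refine by short-time score.

    long_scores, short_scores: dicts mapping speaker ID to a score (higher is better).
    n_long: predetermined number of candidates kept by long-time score.
    n_short: final number kept by short-time score; must be smaller than n_long.
    """
    if not 0 < n_short < n_long:
        raise ValueError("n_short must be positive and smaller than n_long")
    # Stage 1: candidates with the highest long-time scores.
    candidates = sorted(long_scores, key=long_scores.get, reverse=True)[:n_long]
    # Stage 2: among those candidates only, the highest short-time scores.
    ranked = sorted(candidates, key=lambda s: short_scores[s], reverse=True)
    return ranked[:n_short]
```

Because the second stage only ranks the first-stage survivors, a speaker with an excellent short-time score but a poor long-time score is never selected, which reflects the stability the abstract aims for.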
