Quick Patent Search - Fast search of global patents, free commercial patent database - IPRDB

71. Granted invention patent

US08175874B2 Personalized voice activity detection (in force)
Publication No.: US08175874B2
Publication date: 2012-05-08
Application No.: US12092578
Filing date: 2006-07-18
Applicant: Shaul Shimhi
Inventor: Shaul Shimhi
IPC classification: G10L15/10
CPC classification: G10L25/78, G10L17/00
Abstract: A method of transferring a real-time audio signal transmission, including: registering voice patterns (or other characteristics) of one or more users to be used to identify the voices of the users; accepting an audio signal as it is created as a sequence of segments; analyzing each segment of the accepted audio signal to determine if it contains voice activity (314); determining a probability level that the voice activity of the segment is of a registered user (320 & 322); and selectively transferring the contents of a segment responsive to the determined probability level (324).
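
The steps in the abstract above (voice-activity check, speaker match, selective transfer) can be illustrated with a minimal sketch. It is not the patented method: the energy threshold, the cosine-similarity score used in place of a "probability level", and all function and parameter names are assumptions made for illustration.

```python
import math

def frame_energy(segment):
    """Mean squared amplitude of one audio segment (a list of floats)."""
    return sum(x * x for x in segment) / len(segment)

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def gate_segments(segments, features, registered_profile,
                  energy_threshold=0.01, match_threshold=0.8):
    """Keep only segments that contain voice activity AND match the registered user.

    Assumption: features[i] is a voice feature vector already extracted for
    segments[i]; registered_profile is the stored vector of the registered user.
    """
    passed = []
    for segment, feature in zip(segments, features):
        if frame_energy(segment) < energy_threshold:
            continue  # no voice activity detected in this segment
        score = cosine_similarity(feature, registered_profile)  # stand-in for a probability
        if score >= match_threshold:
            passed.append(segment)  # selectively transfer this segment
    return passed
```

A real system would replace the energy test and the similarity score with a trained voice activity detector and a speaker-verification model.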

72. Granted invention patent

US08170873B1 Comparing events in word spotting (in force)
Publication No.: US08170873B1
Publication date: 2012-05-01
Application No.: US10897056
Filing date: 2004-07-22
Applicant: Robert W. Morris
Inventor: Robert W. Morris
IPC classification: G10L15/10, G10L15/06, G10L15/12, G10L15/14, G10L15/00, G10L15/04
CPC classification: G10L15/08, G10L15/187, G10L2015/088
Abstract: An approach to comparing events in word spotting, such as comparing putative and reference instances of a keyword, makes use of a set of models of subword units. For each of two acoustic events and for each of a series of times in each of the events, a probability associated with each of the models of the set of subword units is computed. Then, a quantity characterizing a comparison of the two acoustic events, one occurring in each of the two acoustic signals, is computed using the computed probabilities associated with each of the models.

73. Granted invention patent

US07363222B2 Method for searching data in at least two databases (lapsed)
Publication No.: US07363222B2
Publication date: 2008-04-22
Application No.: US10482517
Filing date: 2002-06-24
Applicant: Michael Josenhans
Inventor: Michael Josenhans
IPC classification: G10L15/10, G10L15/28, G06F7/06
CPC classification: H04M1/271, G10L15/26, H04M1/275, H04M2250/02, Y10S707/99933
Abstract: A method and database system is disclosed for searching data in at least two databases (Dn), particularly for searching telephone directories or the like. To allow simultaneous access to two or more databases by means of speech recognition in order to perform a search therein as in a single database, a search term is input by speech via a voice controlled user interface (28) connected to a database primary control apparatus (26) and comprises speech recognition front end means (8, 9) for processing a sound sequence of a search term input by speech to obtain a comparable speech pattern (X) thereof. By means of speech recognition back end means (6) associated with databases (D1-D6), the comparable speech pattern (X) is compared with corresponding speech patterns (An,i) of database entries (En,i) to determine, for each of the at least two databases (Dn), at least that database entry (En,j) whose speech pattern (An,j) best matches the comparable speech pattern (X) of the search term.
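
The per-database best-match search described in the abstract above can be sketched as follows. This is an illustration only: it assumes speech patterns are fixed-length numeric vectors compared by squared Euclidean distance, which the abstract does not specify, and the function and variable names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length patterns (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_match_per_database(query_pattern, databases):
    """For each database, return the entry whose stored pattern is closest to the query.

    databases maps a database name to a dict of {entry_label: stored_pattern}.
    Empty databases are skipped.
    """
    results = {}
    for name, entries in databases.items():
        if not entries:
            continue
        label, _ = min(entries.items(),
                       key=lambda item: squared_distance(query_pattern, item[1]))
        results[name] = label
    return results
```

The search term is converted to a pattern once, and the same pattern is then compared against every database, so the caller sees one result set as if a single database had been searched.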

74. Granted invention patent

US07310600B1 Language recognition using a similarity measure (lapsed)
Publication No.: US07310600B1
Publication date: 2007-12-18
Application No.: US09695077
Filing date: 2000-10-25
Applicant: Philip Neil Garner, Jason Peter Andrew Charlesworth, Asako Higuchi
Inventor: Philip Neil Garner, Jason Peter Andrew Charlesworth, Asako Higuchi
IPC classification: G10L15/08, G10L15/10
CPC classification: G10L15/12, G10L2015/025
Abstract: A dynamic programming technique is provided for matching two sequences of phonemes both of which may be generated from text or speech. The scoring of the dynamic programming matching technique uses phoneme confusion scores, phoneme insertion scores and phoneme deletion scores which are obtained in advance in a training session and, if appropriate, confidence data generated by a recognition system if the sequences are generated from speech.
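
A minimal sketch of dynamic-programming matching between two phoneme sequences with separate confusion, insertion and deletion scores is given below. It is a generic weighted edit distance, not the patent's exact scoring: scores are expressed as costs to be minimized, the default cost of 1.0 for unlisted phonemes is an assumption, and recognizer confidence data is omitted.

```python
def align_cost(seq_a, seq_b, confusion_cost, insertion_cost, deletion_cost,
               default_cost=1.0):
    """Minimum total cost of aligning phoneme sequence seq_a to seq_b.

    confusion_cost maps (phoneme_a, phoneme_b) to a substitution cost;
    insertion_cost and deletion_cost map a phoneme to its cost.
    Unlisted phonemes or pairs fall back to default_cost (an assumption).
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    table = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):  # align a prefix of seq_a with nothing: deletions only
        table[i][0] = table[i - 1][0] + deletion_cost.get(seq_a[i - 1], default_cost)
    for j in range(1, cols):  # align nothing with a prefix of seq_b: insertions only
        table[0][j] = table[0][j - 1] + insertion_cost.get(seq_b[j - 1], default_cost)
    for i in range(1, rows):
        for j in range(1, cols):
            a, b = seq_a[i - 1], seq_b[j - 1]
            match = 0.0 if a == b else confusion_cost.get((a, b), default_cost)
            table[i][j] = min(
                table[i - 1][j - 1] + match,                               # match or confusion
                table[i - 1][j] + deletion_cost.get(a, default_cost),      # a deleted
                table[i][j - 1] + insertion_cost.get(b, default_cost),     # b inserted
            )
    return table[-1][-1]
```

In a trained system the three cost tables would be estimated in advance, for example as negative log probabilities, so that acoustically similar phonemes are cheap to confuse.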

75. Granted invention patent

US07266495B1 Method and system for learning linguistically valid word pronunciations from acoustic data (in force)
Publication No.: US07266495B1
Publication date: 2007-09-04
Application No.: US10661106
Filing date: 2003-09-12
Applicant: Francoise Beaufays, Ananth Sankar, Mitchel Weintraub, Shaun Williams
Inventor: Francoise Beaufays, Ananth Sankar, Mitchel Weintraub, Shaun Williams
IPC classification: G10L15/06, G10L15/10
CPC classification: G10L15/06, G10L15/187
Abstract: A computerized pronunciation system is provided for generating pronunciations for words and storing the pronunciations in a pronunciation dictionary. The system includes a word list including at least one word; transcribed acoustic data including at least one waveform for the word and transcribed text associated with the waveform; a pronunciation-learning module configured to accept as input the word list and the transcribed acoustic data, the pronunciation-learning module including: sets of initial pronunciations of the word, a scoring module configured to score pronunciations and to generate phone probabilities, and a set of alternate pronunciations of the word, wherein the set of alternate pronunciations includes a highest-scoring set of initial pronunciations with a highest-scoring substitute phone substituted for a lowest-probability phone; and a pronunciation dictionary configured to receive the highest-scoring set of initial pronunciations and the set of alternate pronunciations.

76. Granted invention patent

US07236930B2 Method to extend operating range of joint additive and convolutive compensating algorithms (in force)
Publication No.: US07236930B2
Publication date: 2007-06-26
Application No.: US10822319
Filing date: 2004-04-12
Applicant: Alexis P. Bernard, Yifan Gong
Inventor: Alexis P. Bernard, Yifan Gong
IPC classification: G10L15/10
CPC classification: G10L21/0208, G10L15/10, G10L15/142, G10L15/20
Abstract: The operating range of a joint additive and convolutive compensating method is extended by an enhanced channel estimation procedure that adds SNR-dependent inertia and an SNR-dependent limit on the channel estimate.
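
One way to picture "SNR-dependent inertia" and an "SNR-dependent limit" is a smoothed update whose smoothing weight and clipping bound both depend on the signal-to-noise ratio. The sketch below is an interpretation, not the patented procedure: the linear mapping from SNR to a reliability value, the direction of the dependence (more inertia and a tighter limit at low SNR), and every parameter name are assumptions.

```python
def update_channel_estimate(previous, raw_estimate, snr_db,
                            snr_low=0.0, snr_high=20.0, max_limit=10.0):
    """Blend a new raw channel estimate into the previous one, then clip it.

    Assumptions: estimates are scalars (e.g. one log-spectral bin), and
    reliability grows linearly from 0 at snr_low to 1 at snr_high.
    """
    reliability = min(1.0, max(0.0, (snr_db - snr_low) / (snr_high - snr_low)))
    inertia = 1.0 - reliability  # low SNR: stay close to the previous estimate
    smoothed = inertia * previous + (1.0 - inertia) * raw_estimate
    limit = max_limit * reliability  # low SNR: allow only a small channel estimate
    return max(-limit, min(limit, smoothed))
```

With these assumed defaults, a frame at 10 dB moves the estimate halfway toward the new value and caps its magnitude at 5.0, while a frame at 20 dB or above accepts the new value up to the full limit.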

77. Granted invention patent

US06985858B2 Method and apparatus for removing noise from feature vectors (lapsed)
Publication No.: US06985858B2
Publication date: 2006-01-10
Application No.: US09812524
Filing date: 2001-03-20
Applicant: Brendan J. Frey, Alejandro Acero, Li Deng
Inventor: Brendan J. Frey, Alejandro Acero, Li Deng
IPC classification: G10L15/20, G10L15/10
CPC classification: G10L15/02, G10L15/20
Abstract: A method and computer-readable medium are provided for identifying clean signal feature vectors from noisy signal feature vectors. The method is based on variational inference techniques. One aspect of the invention includes using an iterative approach to identify the clean signal feature vector. Another aspect of the invention includes using the variance of a set of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors. Further aspects of the invention use mixtures of distributions of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors. Additional aspects of the invention include using a variance for the noisy signal feature vector conditioned on fixed values of noise, channel transfer function, and clean speech, when identifying the clean signal feature vector.

78. Granted invention patent

US06973256B1 System and method for detecting highlights in a video program using audio properties (lapsed)
Publication No.: US06973256B1
Publication date: 2005-12-06
Application No.: US09699605
Filing date: 2000-10-30
Applicant: Serhan Dagtas
Inventor: Serhan Dagtas
IPC classification: G06F17/30, G10L11/02, G10L15/00, G10L15/04, G10L15/08, G10L15/10, G11B27/10, G11B27/28, H04N5/76, H04N5/781, H04N5/783, H04N5/93, H04N9/804, H04N5/91
CPC classification: H04N5/76, G06F17/30787, G06F17/30796, G11B27/105, G11B27/28, G11B2220/216, G11B2220/2562, H04N5/781, H04N5/783, H04N9/8042
Abstract: There is disclosed an apparatus for detecting program highlights in a video program. The apparatus comprises: 1) a keyword detection circuit for detecting a location of a selected keyword in an audio track of the video program; and 2) an audio signal energy level detection circuit for determining an audio signal energy level of the audio track proximate the detected location of the selected keyword and comparing the audio signal energy level to a predetermined threshold. The audio signal energy level detection circuit, in response to a determination that the audio signal energy level exceeds the predetermined threshold, identifies the detected location of the selected keyword as a program highlight.
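
The two-stage test in the abstract above (keyword location, then audio energy near that location compared to a threshold) can be sketched in software. The abstract describes circuits; this sketch assumes keyword positions have already been detected and are given as frame indices, and it uses the mean energy in a small window as the "energy level proximate the location". The window size, threshold value and names are illustrative assumptions.

```python
def find_highlights(keyword_positions, frame_energy, window=2, threshold=0.5):
    """Return the keyword positions whose surrounding audio energy exceeds the threshold.

    keyword_positions: frame indices where the selected keyword was detected.
    frame_energy: per-frame energy values of the audio track.
    window: number of frames on each side of the keyword to average over.
    """
    highlights = []
    for pos in keyword_positions:
        start = max(0, pos - window)
        end = min(len(frame_energy), pos + window + 1)
        neighborhood = frame_energy[start:end]
        if neighborhood and sum(neighborhood) / len(neighborhood) > threshold:
            highlights.append(pos)  # loud audio around the keyword: mark as highlight
    return highlights
```

For a sports broadcast, for example, a keyword such as "goal" spoken at normal volume would be ignored, while the same keyword accompanied by crowd noise would be flagged.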

79. Invention patent application

US20050256712A1 Speech recognition device and speech recognition method (in force)
Publication No.: US20050256712A1
Publication date: 2005-11-17
Application No.: US10504926
Filing date: 2004-02-04
Applicant: Maki Yamada, Makoto Nishizaki, Yoshihisa Nakatoh, Shinichi Yoshizawa
Inventor: Maki Yamada, Makoto Nishizaki, Yoshihisa Nakatoh, Shinichi Yoshizawa
IPC classification: G06F3/16, G10L11/02, G10L15/00, G10L15/04, G10L15/06, G10L15/08, G10L15/10, G10L15/14, G10L15/18, G10L15/20, G10L15/22, G10L15/28
CPC classification: G10L15/065, G10L15/08
Abstract: The speech recognition apparatus (1) is equipped with the garbage acoustic model storage unit (110) storing the garbage acoustic model which learned the collection of the unnecessary words; the feature value calculation unit (101) which calculates the feature parameter necessary for recognition by acoustically analyzing the unidentified input speech including the non-language speech per frame which is a unit for speech analysis; the garbage acoustic score calculation unit (111) which calculates the garbage acoustic score by comparing the feature parameter and the garbage acoustic model; the garbage acoustic score correction unit (113) which corrects the garbage acoustic score calculated by the garbage acoustic score calculation unit (111) so as to raise it in the frame where the non-language speech is inputted; and the recognition result output unit (105) which outputs, as the recognition result of the unidentified input speech, the word string with the highest cumulative score of the language score, the word acoustic score, and the garbage acoustic score which is corrected by the garbage acoustic score correcting means.

80. Invention patent application

US20050256710A1 Text message generation (under examination, published)
Publication No.: US20050256710A1
Publication date: 2005-11-17
Application No.: US10507194
Filing date: 2003-03-10
Applicant: Matthias Pankert, Reimund Schmald, Jens Marschner
Inventor: Matthias Pankert, Reimund Schmald, Jens Marschner
IPC classification: G06F3/16, G10L15/08, G10L15/10, G10L15/19, G10L15/22, G10L15/26
CPC classification: G10L15/19
Abstract: The invention relates to a method of generating text messages. In order to make the generation of text messages as convenient and efficient as possible for a user, the following steps are proposed: (1) processing of speech input containing message elements by means of grammar-based speech recognition procedures; (2) processing of speech input by means of speech model-based speech recognition procedures, either in parallel with processing by means of grammar-based speech recognition or once a recognition result has been obtained by means of the grammar-based speech recognition procedures which is not of a predefined quality; (3) generation of a text message using the recognition results produced by means of the grammar-based and/or speech model-based speech recognition procedures.
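
The sequential variant described in the abstract above, where the speech model-based recognizer runs only when the grammar-based result falls short of a predefined quality, can be sketched as simple control flow. Both recognizers are hypothetical callables returning a (text, confidence) pair, and the use of a confidence value as the quality measure is an assumption; the parallel variant is not shown.

```python
def generate_text_message(speech, grammar_recognizer, model_recognizer,
                          min_confidence=0.7):
    """Return message text, preferring the grammar-based result when it is good enough.

    grammar_recognizer and model_recognizer are placeholders for real engines;
    each takes the speech input and returns (text, confidence in [0, 1]).
    """
    text, confidence = grammar_recognizer(speech)
    if confidence < min_confidence:
        # Grammar-based result is below the predefined quality:
        # fall back to the speech model-based recognizer.
        text, _ = model_recognizer(speech)
    return text
```

Grammar-based recognition suits fixed message templates, while the model-based fallback handles free-form dictation that no grammar rule covers.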
