Quick Patent Search - fast search of global patents, free commercial patent database - IPRDB

1. Granted invention patent

US06684185B1 Small footprint language and vocabulary independent word recognizer using registration by word spelling (Status: active)
Publication No.: US06684185B1
Publication date: 2004-01-27
Application No.: US09148579
Filing date: 1998-09-04
Applicants: Jean-Claude Junqua, Ted Applebaum, Roland Kuhn
Inventors: Jean-Claude Junqua, Ted Applebaum, Roland Kuhn
IPC classification: G10L15/06
CPC classification: G10L15/26, G10L15/063, G10L2015/088
Abstract: A phoneticizer converts spelled words or names into one or an n-best number of phonetic transcriptions. The n-best transcriptions may be generated from a single transcription using a confusion matrix. These n-best transcriptions are then transformed into hybrid units. Preferably only the most frequently encountered units are stored as syllables, with the remainder being stored as smaller units such as demi-syllables or phonemes. Voice input is then used to rescore the n-best transcriptions and these are stored preferably as speaker-independent, similarity-based hybrid units concatenated into a string representing the spelled word.
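The step of deriving n-best transcriptions from a single transcription could look roughly like the sketch below. The confusion matrix values, the phoneme symbols, and the function name `n_best` are illustrative assumptions, not taken from the patent.

```python
import heapq
from itertools import product

# Hypothetical confusion matrix: P(alternative phoneme | phoneme).
# Phonemes not listed are assumed to have no confusable alternatives.
CONFUSION = {
    "ae": {"ae": 0.8, "eh": 0.2},
    "t": {"t": 0.9, "d": 0.1},
}

def n_best(transcription, n, confusion=CONFUSION):
    """Return the n most probable variants of a phoneme sequence."""
    # For each position, list the (phoneme, probability) alternatives.
    options = [list(confusion.get(p, {p: 1.0}).items()) for p in transcription]
    scored = []
    for combo in product(*options):
        prob = 1.0
        for _, p in combo:
            prob *= p
        scored.append((prob, [phoneme for phoneme, _ in combo]))
    # Exhaustive enumeration is acceptable for one short word;
    # a practical system would prune the search instead.
    return heapq.nlargest(n, scored, key=lambda item: item[0])

for prob, phones in n_best(["k", "ae", "t"], 3):
    print(f"{prob:.2f}  {' '.join(phones)}")
```

Voice input would then rescore these candidates; that rescoring step is not modelled here.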

2. Granted invention patent

US07103553B2 Assistive call center interface (Status: lapsed)
Publication No.: US07103553B2
Publication date: 2006-09-05
Application No.: US10454716
Filing date: 2003-06-04
Applicants: Ted Applebaum, Jean-Claude Junqua
Inventors: Ted Applebaum, Jean-Claude Junqua
IPC classification: G10L15/00
CPC classification: G10L15/1822
Abstract: Unstructured voice information from an incoming caller is processed by an automatic speech recognition and semantic categorization system to convert the information into structured data that may then be used to access one or more databases to retrieve associated supplemental data. The structured data and associated supplemental data are then made available through a presentation system that provides information to the call center agent and, optionally, to the incoming caller. The system thus allows a call center information processing system to handle unstructured voice input for use by the live agent in handling the incoming call and for storage and retrieval at a later time. The semantic analysis system may be implemented by a global parser or by an information retrieval technique, such as latent semantic analysis. Co-occurrence of keywords may be used to associate prior calls with an incoming call to assist in understanding the purpose of the incoming call.
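One simple way to picture the keyword co-occurrence idea is to score prior calls by how many keywords they share with the incoming call. The sketch below uses Jaccard similarity, which is an assumed stand-in and not the measure the patent specifies. The call identifiers, keywords, and threshold are made up for illustration.

```python
def keyword_overlap(a, b):
    """Jaccard similarity between two keyword collections."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def related_calls(incoming_keywords, prior_calls, threshold=0.3):
    """Return (score, call_id) pairs for sufficiently similar prior calls, best first."""
    scored = [
        (keyword_overlap(incoming_keywords, keywords), call_id)
        for call_id, keywords in prior_calls.items()
    ]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)

# Hypothetical keyword sets extracted from earlier calls.
prior = {
    "call-001": {"router", "reset", "password"},
    "call-002": {"invoice", "refund"},
    "call-003": {"router", "firmware", "update"},
}
print(related_calls({"router", "password", "login"}, prior))
```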

3. Invention patent application

US20050171774A1 Features and techniques for speaker authentication (Status: under examination, published)
Publication No.: US20050171774A1
Publication date: 2005-08-04
Application No.: US10768946
Filing date: 2004-01-30
Applicants: Ted Applebaum, Steven Pearson, Philippe Morin, Jean-Claude Junqua
Inventors: Ted Applebaum, Steven Pearson, Philippe Morin, Jean-Claude Junqua
IPC classification: G10L15/00, G10L17/00
CPC classification: G10L17/12, G10L17/06
Abstract: A speaker authentication system includes an input receptive of user speech from a user. An extraction module extracts acoustic correlates of aspects of the user's physiology from the user speech, including at least one of glottal source parameters, formant related parameters, timing characteristics, and pitch related qualities. An output communicates the acoustic correlates to an authentication module adapted to authenticate the user by comparing the acoustic correlates to predefined acoustic correlates in a datastore.
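The comparison against stored acoustic correlates can be sketched as a thresholded distance check. Everything concrete here is an assumption for illustration: the three features, their scales, the enrolled values, and the threshold. Extraction of the correlates from audio is outside the sketch.

```python
import math

# Hypothetical datastore: user -> acoustic correlate vector
# (mean pitch in Hz, first formant in Hz, speaking rate in syllables/s).
ENROLLED = {"alice": [210.0, 730.0, 4.2], "bob": [120.0, 660.0, 3.6]}
# Assumed per-feature scales so that no single feature dominates the distance.
SCALES = [20.0, 50.0, 0.5]

def distance(a, b, scales=SCALES):
    """Scaled Euclidean distance between two feature vectors."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, scales)))

def authenticate(claimed_user, features, threshold=1.5):
    """Accept the identity claim if the features are close to the stored template."""
    template = ENROLLED.get(claimed_user)
    if template is None:
        return False
    return distance(features, template) <= threshold

sample = [205.0, 740.0, 4.0]
print(authenticate("alice", sample), authenticate("bob", sample))
```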

4. Granted invention patent

US06513004B1 Optimized local feature extraction for automatic speech recognition (Status: active)
Publication No.: US06513004B1
Publication date: 2003-01-28
Application No.: US09449053
Filing date: 1999-11-24
Applicants: Luca Rigazio, David Kryze, Ted Applebaum, Jean-Claude Junqua
Inventors: Luca Rigazio, David Kryze, Ted Applebaum, Jean-Claude Junqua
IPC classification: G10L15/04
CPC classification: G10L15/02, G10L25/27
Abstract: The acoustic speech signal is decomposed into wavelets arranged in an asymmetrical tree data structure from which individual nodes may be selected to best extract local features, as needed to model specific classes of sound units. The wavelet packet transformation is smoothed through integration and compressed to apply a non-linearity prior to discrete cosine transformation. The resulting subband features such as cepstral coefficients may then be used to construct the speech recognizer's speech models. Using the local feature information extracted in this manner allows a single recognizer to be optimized for several different classes of sound units, thereby eliminating the need for parallel path recognizers.
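The pipeline in this abstract (subband decomposition, integration, compression, discrete cosine transform) can be illustrated with a deliberately small sketch. Several choices below are assumptions rather than the patented design: Haar filters, a fixed tree that splits only the low band, mean energy as the integration step, and a logarithm as the compressive non-linearity.

```python
import math

def haar_split(signal):
    """One Haar analysis step: return (low-pass, high-pass) half-length bands."""
    s = 1 / math.sqrt(2)
    low = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    high = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return low, high

def asymmetric_subbands(signal, depth):
    """Split only the low band at each level, keeping every high band as a leaf."""
    bands = []
    current = signal
    for _ in range(depth):
        current, high = haar_split(current)
        bands.append(high)
    bands.append(current)
    return bands[::-1]  # lowest-frequency band first

def dct2(values):
    """Unnormalized DCT-II of a short list."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i, v in enumerate(values))
        for k in range(n)
    ]

def subband_features(frame, depth=3, floor=1e-10):
    """Cepstrum-like features: DCT of log mean energy per subband."""
    bands = asymmetric_subbands(frame, depth)
    # Integration: average energy per subband. Compression: logarithm.
    log_energy = [math.log(sum(c * c for c in b) / len(b) + floor) for b in bands]
    return dct2(log_energy)

# The frame length is assumed to be a multiple of 2 ** depth.
frame = [math.sin(2 * math.pi * i / 8) for i in range(16)]
print(subband_features(frame))
```

Selecting different tree nodes per class of sound unit, which is the point of the patent, would replace the fixed split rule in `asymmetric_subbands`.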

5. Granted invention patent

US06895376B2 Eigenvoice re-estimation technique of acoustic models for speech recognition, speaker identification and speaker verification (Status: active)
Publication No.: US06895376B2
Publication date: 2005-05-17
Application No.: US09849174
Filing date: 2001-05-04
Applicants: Florent Perronnin, Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua
Inventors: Florent Perronnin, Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua
IPC classification: G10L15/06, G10L17/00
CPC classification: G10L15/07, G10L17/02
Abstract: A reduced dimensionality eigenvoice analytical technique is used during training to develop context-dependent acoustic models for allophones. Re-estimation processes are performed to more strongly separate speaker-dependent and speaker-independent components of the speech model. The eigenvoice technique is also used during run time upon the speech of a new speaker. The technique removes individual speaker idiosyncrasies, to produce more universally applicable and robust allophone models. In one embodiment the eigenvoice technique is used to identify the centroid of each speaker, which may then be “subtracted out” of the recognition equation.
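The effect of "subtracting out" a speaker centroid can be shown on toy data. In this sketch the centroid is simply the mean of the speaker's feature vectors, which is an assumed simplification: the patent identifies the centroid with the eigenvoice technique. The feature values are invented.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def subtract_centroid(vectors):
    """Remove the speaker-specific offset from every feature vector."""
    c = centroid(vectors)
    return [[x - m for x, m in zip(v, c)] for v in vectors]

# Two hypothetical speakers producing the same sounds with different offsets.
speaker_a = [[1.0, 2.0], [3.0, 4.0]]
speaker_b = [[11.0, 7.0], [13.0, 9.0]]
print(subtract_centroid(speaker_a))
print(subtract_centroid(speaker_b))
```

After the subtraction both speakers yield the same vectors, so what remains reflects the sounds rather than the speaker.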

6. Granted invention patent

US06341264B1 Adaptation system and method for E-commerce and V-commerce applications (Status: active)
Publication No.: US06341264B1
Publication date: 2002-01-22
Application No.: US09258113
Filing date: 1999-02-25
Applicants: Roland Kuhn, Jean-Claude Junqua
Inventors: Roland Kuhn, Jean-Claude Junqua
IPC classification: G10L15/28
CPC classification: G06Q30/06, G06Q30/0609, G06Q50/188, G10L15/07, G10L15/26, G10L17/00
Abstract: Electronic commerce (E-commerce) and Voice commerce (V-commerce) proceed by having the user speak into the system. The user's speech is converted by a speech recognizer into a form required by the transaction processor that effects the electronic commerce operation. A dimensionality reduction processor converts the user's input speech into a reduced dimensionality set of values termed eigenvoice parameters. These parameters are compared with a set of previously stored eigenvoice parameters representing a speaker population (the eigenspace representing speaker space) and the comparison is used by the speech model adaptation system to rapidly adapt the speech recognizer to the user's speech characteristics. The user's eigenvoice parameters are also stored for subsequent use by the speaker verification and speaker identification modules.

7. Granted invention patent

US06263309B1 Maximum likelihood method for finding an adapted speaker model in eigenvoice space (Status: lapsed)
Publication No.: US06263309B1
Publication date: 2001-07-17
Application No.: US09070054
Filing date: 1998-04-30
Applicants: Patrick Nguyen, Roland Kuhn, Jean-Claude Junqua
Inventors: Patrick Nguyen, Roland Kuhn, Jean-Claude Junqua
IPC classification: G10L15/08
CPC classification: G10L15/07
Abstract: A set of speaker dependent models is trained upon a comparatively large number of training speakers, one model per speaker, and model parameters are extracted in a predefined order to construct a set of supervectors, one per speaker. Principal component analysis is then performed on the set of supervectors to generate a set of eigenvectors that define an eigenvoice space. If desired, the number of vectors may be reduced to achieve data compression. Thereafter, a new speaker provides adaptation data from which a supervector is constructed by constraining this supervector to be in the eigenvoice space based on a maximum likelihood estimation. The resulting coefficients in the eigenspace of this new speaker may then be used to construct a new set of model parameters from which an adapted model is constructed for that speaker. Environmental adaptation may be performed by including environmental variations in the training data.
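Constraining a new speaker's supervector to the eigenvoice space can be sketched as follows, under strong simplifying assumptions that are not from the patent: the eigenvoices are already computed and orthonormal, every supervector component is observed, and all components have equal unit variance. In that special case a Gaussian maximum likelihood estimate reduces to an orthogonal projection. The vectors are toy values.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def adapt(observed, mean, eigenvoices):
    """Project an observed supervector onto the span of the eigenvoices.

    Returns the eigenspace coefficients and the adapted supervector.
    """
    centered = [o - m for o, m in zip(observed, mean)]
    weights = [dot(centered, e) for e in eigenvoices]
    adapted = list(mean)
    for w, e in zip(weights, eigenvoices):
        adapted = [a + w * x for a, x in zip(adapted, e)]
    return weights, adapted

# Toy 3-dimensional supervectors with a 2-dimensional eigenvoice space.
mean = [1.0, 1.0, 1.0]
eigenvoices = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
weights, adapted = adapt([3.0, 4.0, 6.0], mean, eigenvoices)
print(weights, adapted)
```

The third component of the observation falls outside the eigenvoice space, so the adapted supervector keeps the mean value there.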

8. Granted invention patent

US06233561B1 Method for goal-oriented speech translation in hand-held devices using meaning extraction and dialogue (Status: active)
Publication No.: US06233561B1
Publication date: 2001-05-15
Application No.: US09290628
Filing date: 1999-04-12
Applicants: Jean-Claude Junqua, Roland Kuhn, Matteo Contolini, Murat Karaorman, Ken Field, Michael Galler, Yi Zhao
Inventors: Jean-Claude Junqua, Roland Kuhn, Matteo Contolini, Murat Karaorman, Ken Field, Michael Galler, Yi Zhao
IPC classification: G10L15/22
CPC classification: G10L15/1822, G10L15/1815
Abstract: A computer-implemented method and apparatus is provided for processing a spoken request from a user. A speech recognizer converts the spoken request into a digital format. A frame data structure associates semantic components of the digitized spoken request with predetermined slots. The slots are indicative of data which are used to achieve a predetermined goal. A speech understanding module which is connected to the speech recognizer and to the frame data structure determines semantic components of the spoken request. The slots are populated based upon the determined semantic components. A dialog manager which is connected to the speech understanding module may determine at least one slot which is unpopulated based upon the determined semantic components and in a preferred embodiment may provide confirmation of the populated slots. A computer-generated request is formulated in order for the user to provide data related to the unpopulated slot. The method and apparatus are well-suited (but not limited) to use in a hand-held speech translation device.
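The frame-and-slot mechanism lends itself to a short sketch. The goal (booking a hotel room), the slot names, and the prompt wording are hypothetical. Speech recognition and semantic parsing are assumed to have already produced the `semantic_components` dictionary.

```python
# Hypothetical frame for a "book a hotel room" goal.
REQUIRED_SLOTS = ("city", "check_in", "nights")
PROMPTS = {
    "city": "Which city would you like to stay in?",
    "check_in": "What is your check-in date?",
    "nights": "How many nights will you stay?",
}

def populate(frame, semantic_components):
    """Copy recognized semantic components into the matching slots."""
    filled = dict(frame)
    for slot, value in semantic_components.items():
        if slot in REQUIRED_SLOTS:
            filled[slot] = value
    return filled

def next_prompt(frame):
    """Return a question for the first unpopulated slot, or None if complete."""
    for slot in REQUIRED_SLOTS:
        if frame.get(slot) is None:
            return PROMPTS[slot]
    return None

frame = populate({}, {"city": "Paris", "nights": 2})
print(next_prompt(frame))
```

A dialog manager would loop on `next_prompt` until it returns `None`, then confirm the populated slots with the user.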

9. Granted invention patent

US06230131B1 Method for generating spelling-to-pronunciation decision tree (Status: lapsed)
Publication No.: US06230131B1
Publication date: 2001-05-08
Application No.: US09069308
Filing date: 1998-04-29
Applicants: Roland Kuhn, Jean-Claude Junqua, Matteo Contolini
Inventors: Roland Kuhn, Jean-Claude Junqua, Matteo Contolini
IPC classification: G10L13/08
CPC classification: G10L13/08
Abstract: Decision trees are used to store a series of yes-no questions that can be used to convert spelled-word letter sequences into pronunciations. Letter-only trees, having internal nodes populated with questions about letters in the input sequence, generate one or more pronunciations based on probability data stored in the leaf nodes of the tree. The pronunciations may then be improved by processing them using mixed trees which are populated with questions about letters in the sequence and also questions about phonemes associated with those letters. The mixed tree screens out pronunciations that would not occur in natural speech, thereby greatly improving the results of the letter-to-pronunciation transformation.
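A letter-only tree of the kind described can be sketched with two node types. The tree below, for the letter "c", is hand-built with invented probabilities. The patent concerns generating such trees from data, and its mixed trees (which also ask about phonemes) are not shown.

```python
from dataclasses import dataclass
from typing import Dict, Union

@dataclass
class Leaf:
    probs: Dict[str, float]  # phoneme -> probability

@dataclass
class Question:
    offset: int   # position relative to the current letter
    letters: str  # answer "yes" if the letter at that position is one of these
    yes: "Tree"
    no: "Tree"

Tree = Union[Leaf, Question]

def classify(tree, word, pos):
    """Walk the yes-no questions for word[pos] and return the leaf probabilities."""
    node = tree
    while isinstance(node, Question):
        i = pos + node.offset
        letter = word[i] if 0 <= i < len(word) else "#"  # "#" marks a word boundary
        node = node.yes if letter in node.letters else node.no
    return node.probs

# Hypothetical tree for the letter "c".
C_TREE = Question(
    1, "eiy",
    Leaf({"s": 0.9, "k": 0.1}),
    Question(1, "h", Leaf({"ch": 0.8, "k": 0.2}), Leaf({"k": 1.0})),
)

for word in ("city", "chat", "cat"):
    print(word, classify(C_TREE, word, 0))
```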

10. Granted invention patent

US06711541B1 Technique for developing discriminative sound units for speech recognition and allophone modeling (Status: active)
Publication No.: US06711541B1
Publication date: 2004-03-23
Application No.: US09390434
Filing date: 1999-09-07
Applicants: Roland Kuhn, Jean-Claude Junqua, Matteo Contolini
Inventors: Roland Kuhn, Jean-Claude Junqua, Matteo Contolini
IPC classification: G10L15/04
CPC classification: G10L15/063, G10L2015/025
Abstract: A set of models is developed to represent sound units and these models are then used with the incorrect sound units to determine which generate high likelihood scores. The models generating high likelihood scores for the incorrect sound units represent those that are more likely to be confused. The resulting confusability data may then be used in generating more discriminative speech models and in subsequent pruning of the acoustic decision tree. The confusability data may also be used to develop confusability predictors used for rejection during search and in developing continuous speech recognition models that are optimized to minimize confusability.
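Scoring each model against data from the wrong sound units can be illustrated with one-dimensional Gaussians. The models, sample values, and threshold are invented for the sketch. Real acoustic models would be far richer than a single mean and variance.

```python
import math

def log_likelihood(x, mean, var):
    """Log density of a 1-D Gaussian at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

# Hypothetical per-unit models (mean, variance) and held-out samples.
MODELS = {"s": (5.0, 1.0), "z": (5.5, 1.0), "m": (0.0, 1.0)}
SAMPLES = {"s": [4.8, 5.1, 5.3], "z": [5.4, 5.6, 5.9], "m": [-0.2, 0.1, 0.3]}

def confusability(models, samples):
    """Average log-likelihood of each model on data from every other unit."""
    table = {}
    for model_unit, (mean, var) in models.items():
        for data_unit, xs in samples.items():
            if data_unit == model_unit:
                continue
            scores = [log_likelihood(x, mean, var) for x in xs]
            table[(model_unit, data_unit)] = sum(scores) / len(scores)
    return table

def confusable_pairs(table, threshold=-2.0):
    """(model, incorrect unit) pairs whose score exceeds the threshold."""
    return sorted(pair for pair, score in table.items() if score > threshold)

print(confusable_pairs(confusability(MODELS, SAMPLES)))
```

Pairs flagged this way are the ones a discriminative retraining step would try to pull apart.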
