    • 1. Granted invention patent
    • Voice dialing server for branch exchange telephone systems
    • US5930336A
    • 1999-07-27
    • US723914
    • 1996-09-30
    • Jean-Claude Junqua; Philippe R. Morin; Ted H. Applebaum
    • Jean-Claude Junqua; Philippe R. Morin; Ted H. Applebaum
    • G10L15/02; G10L15/10; H04M3/42; H04M3/493; H04Q3/58; H04Q3/62; H04M1/64
    • H04M3/4931; G10L15/02; G10L15/10; H04M3/42204; H04M3/42314; H04Q3/627; G10L2015/228; H04M2201/40
    • The voice dialing server plugs into one or more unused extensions of a branch exchange system to provide each of the users on the system with voice dialing services. To use the system a user simply dials the extension to which the server is attached. The server then prompts the user to supply the name of a party to be called. The name is then looked up in a telephone number dictionary unique to that user. The system then places the telephone call by sending commands to the branch exchange system that simulate the operations a user would perform to connect to an outside line or inside extension and then place the call. The server incorporates a speech processing module having a multistage word recognizer that represents speech in terms of high phoneme similarity values. This representation is highly compact, allowing the word recognizer to perform the recognizer and fine match stages with far less processor overhead than frame-by-frame speech recognizers.
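The lookup-and-dial flow in this abstract can be illustrated with a small sketch. It is not the patented implementation: the class name `VoiceDialingServer`, the outside-line prefix `9`, and the rule that numbers of four digits or fewer are inside extensions are all assumptions made for illustration. Speech recognition is left out; the sketch starts from an already recognized name.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceDialingServer:
    """Toy model of the per-user lookup and simulated dialing (hypothetical)."""
    # Assumption: the exchange grants an outside line when this prefix is dialed.
    outside_line_prefix: str = "9"
    # One telephone number dictionary per user, keyed by the caller's extension.
    dictionaries: dict = field(default_factory=dict)

    def add_entry(self, user_ext: str, name: str, number: str) -> None:
        self.dictionaries.setdefault(user_ext, {})[name.lower()] = number

    def dial_commands(self, user_ext: str, recognized_name: str) -> list:
        """Return the key presses a user would make to place the call."""
        number = self.dictionaries.get(user_ext, {}).get(recognized_name.lower())
        if number is None:
            raise KeyError(f"{recognized_name!r} not in dictionary for {user_ext}")
        # Assumption: short numbers are inside extensions and need no prefix.
        if len(number) <= 4:
            return list(number)
        return [self.outside_line_prefix, *number]


server = VoiceDialingServer()
server.add_entry("101", "Alice", "204")
server.add_entry("101", "Bob", "5551234")
print(server.dial_commands("101", "alice"))  # ['2', '0', '4']
print(server.dial_commands("101", "Bob"))    # ['9', '5', '5', '5', '1', '2', '3', '4']
```

Keying the dictionaries by calling extension mirrors the abstract's point that each user has a separate telephone number dictionary, so the same spoken name can resolve to different numbers for different callers.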
    • 3. Granted invention patent
    • Speech recognition training for small hardware devices
    • US06463413B1
    • 2002-10-08
    • US09295276
    • 1999-04-20
    • Ted H. Applebaum; Jean-Claude Junqua
    • Ted H. Applebaum; Jean-Claude Junqua
    • G10L15/14
    • G10L15/30; G10L15/06; G10L15/187; G10L15/22; G10L2015/0638
    • A distributed speech processing system for constructing speech recognition reference models that are to be used by a speech recognizer in a small hardware device, such as a personal digital assistant or cellular telephone. The speech processing system includes a speech recognizer residing on a first computing device and a speech model server residing on a second computing device. The speech recognizer receives speech training data and processes it into an intermediate representation of the speech training data. The intermediate representation is then communicated to the speech model server. The speech model server generates a speech reference model by using the intermediate representation of the speech training data and then communicates the speech reference model back to the first computing device for storage in a lexicon associated with the speech recognizer.
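The division of labour described here (device computes an intermediate representation, server builds the reference model, device stores the result in its lexicon) might look roughly like the sketch below. The abstract does not say what the intermediate representation or the model are, so per-frame energy and frame-wise averaging are stand-ins chosen for illustration, and the function names `to_intermediate` and `build_reference_model` are hypothetical.

```python
from statistics import fmean


def to_intermediate(samples, frame_len=4):
    """Device side: reduce raw samples to one mean-energy value per frame.

    Mean energy is an illustrative stand-in for the intermediate representation.
    """
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    return [fmean(x * x for x in frame) for frame in frames]


def build_reference_model(intermediates):
    """Server side: average the frame features across the training utterances."""
    n_frames = min(len(rep) for rep in intermediates)
    return [fmean(rep[t] for rep in intermediates) for t in range(n_frames)]


# Device: process the training speech locally and send only the compact form.
utterances = [
    [0.0, 1.0, 0.0, -1.0, 2.0, 2.0, 2.0, 2.0],
    [0.0, 1.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0],
]
payload = [to_intermediate(u) for u in utterances]

# Server: build the model and return it; the device stores it in its lexicon.
lexicon = {"hello": build_reference_model(payload)}
print(lexicon["hello"])  # [0.5, 2.0]
```

The point of the split is that the heavier model-building step runs on the second computing device, while the small device only performs the lighter feature step and stores the finished model.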
    • 5. Granted invention patent
    • Segment-based similarity method for low complexity speech recognizer
    • US06230129B1
    • 2001-05-08
    • US09199721
    • 1998-11-25
    • Philippe R. Morin; Ted H. Applebaum
    • Philippe R. Morin; Ted H. Applebaum
    • G10L15/02
    • G10L15/10; G10L2015/025
    • A digital word prototype is constructed using one or more speech utterances for a given spoken word or phrase. First, a phone model is used to derive phoneme similarity time series for each of a plurality of phonemes which represent the degree of similarity between the speech utterance and a set of standard phonemes contained in the phone model. Next, the phoneme similarity data is normalized in relation to a non-speech part of the input speech signal. The normalized phoneme similarity data is divided into segments, such that the sum of all normalized phoneme similarity values in a segment is equal for each segment. Next, a word model is constructed from the phoneme similarity data. To do so, within each segment, a summation value is determined by summing over speech frames each of the normalized phoneme similarity values associated with a particular phoneme. In this way, the word model is represented by a vector of summation values that compactly correlate to the normalized phoneme similarity data. Lastly, the results of the individually processed utterances for a given spoken word (i.e., the individual word models) are combined to produce a digital word prototype that electronically represents the given spoken word.
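The steps in this abstract (normalize, cut into equal-mass segments, sum per phoneme within each segment, combine utterances) can be sketched as follows. Several details are assumptions rather than the patent's method: normalization is done by subtracting each phoneme's mean over the non-speech frames and flooring at zero, segment boundaries fall on whole frames (so segment sums are only approximately equal in general), and utterances are combined by element-wise averaging. The names `word_model` and `prototype` are hypothetical.

```python
def word_model(sim, noise_frames, n_segments):
    """Build a flat vector of per-segment, per-phoneme similarity sums.

    sim: list of frames, each a list of phoneme similarity values.
    noise_frames: indices of frames known to contain no speech.
    """
    n_ph = len(sim[0])
    # Normalize against the non-speech part: subtract the baseline, floor at 0.
    base = [sum(sim[t][p] for t in noise_frames) / len(noise_frames)
            for p in range(n_ph)]
    norm = [[max(0.0, frame[p] - base[p]) for p in range(n_ph)] for frame in sim]

    # Advance to the next segment once the running sum reaches its equal share.
    total = sum(map(sum, norm))
    model = [[0.0] * n_ph for _ in range(n_segments)]
    running, seg = 0.0, 0
    for frame in norm:
        for p in range(n_ph):
            model[seg][p] += frame[p]  # summation value per phoneme
        running += sum(frame)
        while seg < n_segments - 1 and running >= total * (seg + 1) / n_segments:
            seg += 1
    return [value for segment in model for value in segment]


def prototype(models):
    """Combine individual word models by element-wise averaging (assumption)."""
    return [sum(values) / len(models) for values in zip(*models)]


# Two utterances of one word, two phonemes each; frame 0 is non-speech.
utt1 = [[1, 1], [3, 1], [3, 1], [1, 3], [1, 3], [1, 1]]
utt2 = [[0, 0], [2, 0], [0, 2], [0, 0]]
m1 = word_model(utt1, noise_frames=[0], n_segments=2)
m2 = word_model(utt2, noise_frames=[0], n_segments=2)
print(m1)                   # [4.0, 0.0, 0.0, 4.0]
print(prototype([m1, m2]))  # [3.0, 0.0, 0.0, 3.0]
```

Note that the two utterances differ in length (six frames versus four) yet yield vectors of the same size, which is what makes the representation compact and directly comparable across utterances.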
    • 7. Granted invention patent
    • Multistage word recognizer based on reliably detected phoneme similarity regions
    • US5822728A
    • 1998-10-13
    • US526746
    • 1995-09-08
    • Ted H. Applebaum; Philippe R. Morin
    • Ted H. Applebaum; Philippe R. Morin
    • G10L15/02; G10L15/08; G10L7/08
    • G10L15/08; G10L15/02
    • The multistage word recognizer uses a word reference representation based on reliably detected peaks of phoneme similarity values. The word reference representation captures the basic features of the words by targets that describe the location and shape of stable peaks of phoneme similarity values. The first stage of the word hypothesizer represents each reference word with statistical information on the number of high similarity regions over a predefined number of time intervals. The second stage represents each word by a prototype that consists of a series of phoneme targets and global statistics, namely the average word duration and average match rate. These represent the degree of fit of the word prototype to its training data. Word recognition scores generated in the two stages are converted to dimensionless normalized values and combined by averaging for use in selecting the most probable word candidates.
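The last sentence of this abstract, where scores from the two stages are made dimensionless and averaged, can be illustrated with a short sketch. The abstract does not state how the scores are normalized; z-scoring across the candidate words is an assumption made here, and the function names and sample scores are hypothetical.

```python
from statistics import fmean, pstdev


def normalize(scores):
    """Convert raw stage scores to dimensionless z-scores (assumed scheme)."""
    mu = fmean(scores.values())
    sigma = pstdev(scores.values()) or 1.0  # avoid dividing by zero
    return {word: (s - mu) / sigma for word, s in scores.items()}


def rank(stage1, stage2, top_n=2):
    """Average the normalized scores of both stages and return the best words."""
    z1, z2 = normalize(stage1), normalize(stage2)
    combined = {word: (z1[word] + z2[word]) / 2 for word in z1}
    return sorted(combined, key=combined.get, reverse=True)[:top_n]


# Hypothetical raw scores on different scales for three candidate words.
stage1 = {"dial": 12.0, "dell": 9.0, "mail": 3.0}    # region-count stage
stage2 = {"dial": 0.80, "dell": 0.85, "mail": 0.30}  # target-match stage
print(rank(stage1, stage2))  # ['dial', 'dell']
```

Normalizing first matters because the two stages score on unrelated scales; averaging the raw values would let whichever stage has the larger numbers dominate.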
    • 10. Granted invention patent
    • Prosody template matching for text-to-speech systems
    • US06845358B2
    • 2005-01-18
    • US09755699
    • 2001-01-05
    • Nicholas Kibre; Ted H. Applebaum
    • Nicholas Kibre; Ted H. Applebaum
    • G10L13/04; G10L13/08; G06F17/21; G10L13/06
    • G10L13/10
    • A prosody matching template in the form of a tree structure stores indices which point to lookup table and template information prescribing pitch and duration values that are used to add inflection to the output of a text-to-speech synthesizer. The lookup module employs a search algorithm that explores each branch of the tree, assigning penalty scores based on whether the syllable represented by a node of the tree does or does not match the corresponding syllable of the target word. The path with the lowest penalty score is selected as the index into the prosody template table. The system will add nodes by cloning existing nodes in cases where it is not possible to find a one-to-one match between the number of syllables in the target word and the number of nodes in the tree.
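The penalty-based tree search described in this abstract can be sketched in a few lines. This is a simplified illustration, not the patented method: syllables are represented by hypothetical stress labels (`"1"` stressed, `"0"` unstressed), every mismatch costs a fixed penalty of 1, and the node-cloning step for words whose syllable count has no matching path is not modelled (such paths are simply skipped).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    syllable: str                         # hypothetical stress label
    template_index: Optional[int] = None  # on leaves: index into the prosody table
    children: list = field(default_factory=list)


def best_template(roots, target, mismatch_penalty=1):
    """Explore every branch; return (penalty, template_index) of the cheapest path."""
    best = None
    stack = [(node, 0, 0) for node in roots]  # (node, depth, penalty so far)
    while stack:
        node, depth, penalty = stack.pop()
        if depth >= len(target):
            continue  # path longer than the target word; cloning not modelled
        if node.syllable != target[depth]:
            penalty += mismatch_penalty
        is_full_path = not node.children and depth == len(target) - 1
        if is_full_path and (best is None or penalty < best[0]):
            best = (penalty, node.template_index)
        stack.extend((child, depth + 1, penalty) for child in node.children)
    return best


# A small tree covering three two-syllable stress patterns.
tree = [
    Node("1", children=[Node("0", template_index=0), Node("1", template_index=1)]),
    Node("0", children=[Node("1", template_index=2)]),
]
print(best_template(tree, ["0", "1"]))  # (0, 2)
print(best_template(tree, ["1", "0"]))  # (0, 0)
```

The returned index would then select the pitch and duration values from the prosody template table; a target with no full-length path returns `None` here, which is the case the abstract handles by cloning existing nodes.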