Quick Patent Search - fast search of global patents, free patent database for commercial use - IPRDB

41. Granted invention patent

US06389395B1 System and method for generating a phonetic baseform for a word and using the generated baseform for speech recognition (status: lapsed)
Publication number: US06389395B1
Publication date: 2002-05-14
Application number: US08817072
Filing date: 1997-04-04
Applicant: Simon P. Ringland
Inventor: Simon P. Ringland
IPC classification: G10L15/14
CPC classification: G10L15/063, G10L2015/025
Abstract: Out-of-vocabulary word models for a speech recognizer vocabulary are generated by forming phonemic transcriptions (phonetic baseforms) of a user's utterances in terms of existing reference phonemes by using a speech recognition algorithm to match input sub-word feature sample sequences to suitably-constrained allowable sequences of existing reference phoneme features. The resultant new-vocabulary-word phonetic baseform models are stored for subsequent speech recognition using the same recognition algorithm.

42. Granted invention patent

US06374222B1 Method of memory management in speech recognition (status: in force)
Publication number: US06374222B1
Publication date: 2002-04-16
Application number: US09354486
Filing date: 1999-07-16
Applicant: Yu-Hung Kao
Inventor: Yu-Hung Kao
IPC classification: G10L15/14
CPC classification: G10L15/285, G10L15/08, G10L15/187, G10L2015/085
Abstract: A memory management method is described for reducing the size of memory required in speech recognition searching. The searching involves parsing the input speech and building a dynamically changing search tree. The basic unit of the search network is a slot. The present invention describes ways of reducing the size of the slot and therefore the size of the required memory. The slot size is reduced by removing the time index, by packing the model_index and state_index together, and by a coding for the last_time field in which one bit indicates that a slot is available for reuse and a second bit is for backtrace update.
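The slot-packing idea in the US06374222B1 abstract can be illustrated with a minimal sketch. The abstract names the fields model_index, state_index and last_time but gives no bit widths or layout, so the 12/4-bit split, the 16-bit last_time word, the flag positions, and all function names below are assumptions made for illustration only.

```python
# Hypothetical layout: the patent abstract does not state bit widths.
MODEL_BITS = 12                      # assumed width of model_index
STATE_BITS = 4                       # assumed width of state_index
MODEL_MASK = (1 << MODEL_BITS) - 1
STATE_MASK = (1 << STATE_BITS) - 1

# Assumed 16-bit last_time word: two flag bits plus a 14-bit frame number.
FREE_FLAG = 1 << 15                  # slot is available for reuse
BACKTRACE_FLAG = 1 << 14             # slot needs a backtrace update
TIME_MASK = BACKTRACE_FLAG - 1


def pack_indices(model_index, state_index):
    """Store model_index and state_index in one integer instead of two."""
    if not 0 <= model_index <= MODEL_MASK:
        raise ValueError("model_index out of range")
    if not 0 <= state_index <= STATE_MASK:
        raise ValueError("state_index out of range")
    return (model_index << STATE_BITS) | state_index


def unpack_indices(packed):
    """Recover (model_index, state_index) from the packed integer."""
    return packed >> STATE_BITS, packed & STATE_MASK


def encode_last_time(frame, free=False, backtrace=False):
    """Combine the frame number with the two status bits."""
    if not 0 <= frame <= TIME_MASK:
        raise ValueError("frame out of range")
    word = frame
    if free:
        word |= FREE_FLAG
    if backtrace:
        word |= BACKTRACE_FLAG
    return word


def decode_last_time(word):
    """Return (frame, free, backtrace) from the encoded word."""
    return word & TIME_MASK, bool(word & FREE_FLAG), bool(word & BACKTRACE_FLAG)
```

Under these assumed widths, two index fields and two status flags fit in two 16-bit words, which is the kind of per-slot saving the abstract describes.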

43. Granted invention patent

US06370505B1 Speech recognition system and method (status: lapsed)
Publication number: US06370505B1
Publication date: 2002-04-09
Application number: US09302370
Filing date: 1999-04-30
Applicant: Julian Odell
Inventor: Julian Odell
IPC classification: G10L15/14
CPC classification: G10L15/144, G10L15/10
Abstract: The present invention relates to a method of processing speech, in which input speech is processed to determine an input speech vector (or) representing a sample of the speech. A number of possible output states are defined with each output state (j) being represented by a number of state mixture components (m). Each state mixture component is then approximated by a weighted sum of a number of predetermined generic components (x), allowing the likelihood of each output state (j) corresponding to the input speech vector (or) to be determined.
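The approximation described in the US06370505B1 abstract can be sketched as follows. This is an illustrative reading, not the patented implementation: it assumes scalar observations and Gaussian generic components for brevity (a real recognizer would use feature vectors), and the function names and data layout are hypothetical. The point it shows is that the generic component likelihoods are evaluated once per input sample and then reused by every state.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian; stands in for a generic component."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def state_likelihoods(obs, generic, states):
    """Likelihood of each output state j for one observation.

    generic: list of (mean, var) pairs, the shared generic components x.
    states:  {j: [(c_jm, [w_jmx for each generic component]), ...]},
             where c_jm is the weight of mixture component m in state j and
             w_jmx weights generic component x in the approximation of m.
    """
    # Evaluate each generic component once; all states share these values.
    g = [gaussian_pdf(obs, mean, var) for mean, var in generic]
    return {
        j: sum(c * sum(w * gx for w, gx in zip(ws, g)) for c, ws in mixtures)
        for j, mixtures in states.items()
    }
```

With G generic components, the per-sample cost is G density evaluations plus weighted sums, rather than one density evaluation per state mixture component.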

44. Granted invention patent

US06308157B1 Method and apparatus for providing an event-based “What-Can-I-Say?” window (status: in force)
Publication number: US06308157B1
Publication date: 2001-10-23
Application number: US09328095
Filing date: 1999-06-08
Applicants: Ronald E. Vanbuskirk, James R. Lewis, Kerry A. Ortega, Huifang Wang, Amado Nassiff
Inventors: Ronald E. Vanbuskirk, James R. Lewis, Kerry A. Ortega, Huifang Wang, Amado Nassiff
IPC classification: G10L15/14
CPC classification: G10L15/26, G10L2015/228
Abstract: A method and system efficiently identify voice commands for a user of a speech recognition system. The method involves a series of steps including: receiving input from a user; monitoring the computer system to log system events and ascertain a current system state; predicting a probable next event according to the current system state and logged events; and identifying acceptable voice commands to perform the next event. The system events include commands, system control activities, timed activities, and application activation. These events are statistically analyzed in light of the current system state to determine the probable next event. The voice commands for performing the probable next event are displayed to the user.
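The steps in the US06308157B1 abstract (log events, predict the probable next event for the current state, show the matching voice commands) can be sketched as below. The abstract does not say which statistical analysis is used, so this sketch assumes the simplest one, a per-state frequency count; the class, its methods, and the event and command names are all hypothetical.

```python
from collections import Counter, defaultdict


class NextEventPredictor:
    """Frequency-based guess at the next event for a given system state."""

    def __init__(self, commands):
        # commands: {event: [voice commands that perform that event]}
        self._commands = commands
        self._counts = defaultdict(Counter)

    def log(self, state, event):
        """Record that `event` occurred while the system was in `state`."""
        self._counts[state][event] += 1

    def predict(self, state):
        """Most frequently logged event for `state`, or None if unseen."""
        counts = self._counts.get(state)
        if not counts:
            return None
        return counts.most_common(1)[0][0]

    def commands_for(self, state):
        """Voice commands to display for the predicted next event."""
        return self._commands.get(self.predict(state), [])


predictor = NextEventPredictor({"save": ["save file"], "print": ["print document"]})
predictor.log("editor", "save")
predictor.log("editor", "print")
predictor.log("editor", "save")
```

Here `predictor.commands_for("editor")` yields the commands for "save", since that event was logged most often in the "editor" state.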

45. Granted invention patent

US06233556B1 Voice processing and verification system (status: in force)
Publication number: US06233556B1
Publication date: 2001-05-15
Application number: US09216622
Filing date: 1998-12-16
Applicants: Remco Teunen, Ben Shahshahani
Inventors: Remco Teunen, Ben Shahshahani
IPC classification: G10L15/14
CPC classification: G10L17/20, G10L15/20, G10L21/0216
Abstract: A voice processing and verification system accounts for variations dependent upon telephony equipment differences. Models are developed for the various types of telephony equipment from many users speaking on each of the types of equipment. A transformation algorithm is determined for making a transformation between each of the various types of equipment to each of the others. In other words, a model is formed for carbon button telephony equipment from many users. Similarly, a model is formed for electret telephony equipment from many users, and for cellular telephony equipment from many users. During an enrollment, a user speaks to the system. The system forms and stores a model of the user's speech. The type of telephony equipment used in the original enrollment session is also detected and stored along with the enrollment voice model. The system determines the types of telephony equipment being used based upon the spectrum of sound it receives. The telephony equipment type determination is based upon models formed for each of the telephony equipment types spoken by many different users. Thereafter, when a current user calls in, his/her voice will be compared to the stored model if the same telephony equipment as used in the enrollment is determined. If the user calls in on another type of equipment than that used during the enrollment, the transformation for telephony equipment is applied to the model. The user's voice is then verified against the transformed model. This improves the error rate resulting from different telephony equipment types.

46. Granted invention patent

US06226613B1 Decoding input symbols to input/output hidden markoff models (status: in force)
Publication number: US06226613B1
Publication date: 2001-05-01
Application number: US09183474
Filing date: 1998-10-30
Applicant: William Turin
Inventor: William Turin
IPC classification: G10L15/14
CPC classification: G10L15/142
Abstract: The invention provides an information decoding system which takes advantage of the finite duration of channel memory and other distortions to permit efficient decoding of hidden Markov modeled information while storing only a subset of matrices used by the previous art. The invention may be applied to the maximum a posteriori (MAP) estimation of the input symbols of an input-output hidden Markov model, which can be described by the input-output transition probability density matrices or, alternatively, by finite-state systems. The invention is also applied to MAP decoding of information transmitted over channels with bursts of errors, to handwriting and speech recognition and other probabilistic systems as well.

47. Granted invention patent

US06223159B1 Speaker adaptation device and speech recognition device (status: lapsed)
Publication number: US06223159B1
Publication date: 2001-04-24
Application number: US09217928
Filing date: 1998-12-22
Applicant: Jun Ishii
Inventor: Jun Ishii
IPC classification: G10L15/14
CPC classification: G10L15/065, G10L15/144
Abstract: Voice feature quantity extractor extracts feature vector time-series data by acoustic feature quantity analysis of the speaker's voice. Reference speaker-dependent conversion factor computation device computes reference speaker-dependent conversion factors through use of a reference speaker voice data feature vector and an initial standard pattern. The reference speaker-dependent conversion factors are stored in a reference speaker-dependent conversion factor storage device. Speaker-dependent conversion factor selector selects one or more sets of reference speaker-dependent conversion factors stored in the reference speaker-dependent conversion factor storage device. Speaker-dependent conversion factor computation device computes speaker-dependent conversion factors, through use of the selected one or more sets of reference speaker-dependent conversion factors. Speaker-dependent standard pattern computation device converts parameters of the initial standard pattern, through use of the speaker-dependent conversion factors, and thus-converted parameters are output as a speaker-dependent standard pattern.

48. Granted invention patent

US06778958B1 Symbol insertion apparatus and method (status: lapsed)
Publication number: US06778958B1
Publication date: 2004-08-17
Application number: US09651679
Filing date: 2000-08-30
Applicants: Masafumi Nishimura, Nobuyasu Itoh, Shinsuke Mori
Inventors: Masafumi Nishimura, Nobuyasu Itoh, Shinsuke Mori
IPC classification: G10L15/14
CPC classification: G10L15/18
Abstract: An apparatus and method are provided for the insertion of punctuation marks into appropriate positions in a sentence. An acoustic processor processes input utterances to extract voice data, and transforms the data into a feature vector. When the automatic insertion of punctuation marks is not performed, a language decoder processes the feature vector using only a general-purpose language model, and inserts a comma at a location marked in the voice data by the entry “ten,” for example, which is clearly a location at which a comma should be inserted. When automatic punctuation insertion is performed, the language decoder employs the general-purpose language model and the punctuation mark language model to identify an unvoiced, pause location for the insertion of a punctuation mark, such as a comma.

49. Granted invention patent

US06735562B1 Method for estimating a confidence measure for a speech recognition system (status: in force)
Publication number: US06735562B1
Publication date: 2004-05-11
Application number: US09588163
Filing date: 2000-06-05
Applicants: Yaxin Zhang, Ho Chuen Choi, Jian Ming Song
Inventors: Yaxin Zhang, Ho Chuen Choi, Jian Ming Song
IPC classification: G10L15/14
CPC classification: G10L15/01
Abstract: A method of estimating a confidence measure for a speech recognition system involves comparing an input speech signal with a number of predetermined models of possible speech signals. Best scores indicating the degree of similarity between the input speech signal and each of the predetermined models are then used to determine a normalized variance, which is used as the Confidence Measure. In order to determine whether the input speech signal has been correctly recognized, the Confidence Measure is compared to a threshold value. The threshold value is weighted according to the Signal to Noise Ratio of the input speech signal and according to the number of predetermined models used.
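The confidence test in the US06735562B1 abstract can be sketched as follows. The abstract does not give the normalization, the weighting formula, or the direction of the comparison, so three assumptions are made here and should be read as such: the variance is normalized by the squared mean of the scores, the two weights are plain multipliers supplied by the caller, and a larger spread among the scores counts as higher confidence. All function and parameter names are hypothetical.

```python
from statistics import fmean, pvariance


def normalized_variance(scores):
    """Variance of the best scores divided by their squared mean.

    Assumption: the abstract only says "normalized variance"; dividing by
    the squared mean is one common choice, not necessarily the patent's.
    """
    mean = fmean(scores)
    if mean == 0:
        raise ValueError("mean score is zero; cannot normalize")
    return pvariance(scores, mu=mean) / (mean * mean)


def is_recognized(scores, base_threshold, snr_weight, model_count_weight):
    """Compare the confidence measure with a weighted threshold.

    snr_weight and model_count_weight are hypothetical multipliers standing
    in for the weighting by signal-to-noise ratio and by number of models.
    Assumption: a larger spread means the best model stands out more.
    """
    threshold = base_threshold * snr_weight * model_count_weight
    return normalized_variance(scores) >= threshold
```

For example, the scores `[1.0, 3.0]` have mean 2 and population variance 1, giving a confidence measure of 0.25 under this normalization.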

50. Granted invention patent

US06725195B2 Method and apparatus for probabilistic recognition using small number of state clusters (status: in force)
Publication number: US06725195B2
Publication date: 2004-04-20
Application number: US10029420
Filing date: 2001-10-22
Applicants: Ananth Sankar, Venkata Ramana Rao Gadde
Inventors: Ananth Sankar, Venkata Ramana Rao Gadde
IPC classification: G10L15/14
CPC classification: G10L15/06, G10L15/144, G10L2015/0631
Abstract: Probabilistic recognition using clusters and simple probability functions provides improved performance by employing a limited number of clusters each using a relatively large number of simple probability functions. The simple probability functions for each of the limited number of state clusters are greater in number than the limited number of state clusters.
