    • 42. Granted invention patent
    • Method of memory management in speech recognition
    • US06374222B1
    • 2002-04-16
    • US09354486
    • 1999-07-16
    • Yu-Hung Kao
    • Yu-Hung Kao
    • G10L15/14
    • G10L15/285; G10L15/08; G10L15/187; G10L2015/085
    • A memory management method is described for reducing the memory required for speech recognition search. The search parses the input speech and builds a dynamically changing search tree. The basic unit of the search network is a slot. The invention describes ways of reducing the slot size and therefore the required memory: removing the time index, packing model_index and state_index together, and coding the last_time field so that one bit indicates that a slot is available for reuse and a second bit is used for backtrace update.
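The slot-size reductions in this abstract can be illustrated with a small bit-packing sketch. The field widths, the placement of the two flag bits in the low end of last_time, and all function names below are assumptions made for illustration; the abstract does not specify them.

```python
# Sketch only: field widths and bit positions are assumed, not taken from the patent.
MODEL_BITS = 10                      # assumed width of model_index
STATE_BITS = 6                       # assumed width of state_index
STATE_MASK = (1 << STATE_BITS) - 1

AVAILABLE_BIT = 1 << 0               # slot is available for reuse
BACKTRACE_BIT = 1 << 1               # slot was touched by a backtrace update
FLAG_BITS = 2


def pack_indices(model_index, state_index):
    """Pack model_index and state_index into a single integer field."""
    if not 0 <= model_index < (1 << MODEL_BITS):
        raise ValueError("model_index out of range")
    if not 0 <= state_index < (1 << STATE_BITS):
        raise ValueError("state_index out of range")
    return (model_index << STATE_BITS) | state_index


def unpack_indices(packed):
    """Recover (model_index, state_index) from the packed field."""
    return packed >> STATE_BITS, packed & STATE_MASK


def encode_last_time(frame, available=False, backtrace=False):
    """Store the frame number with the two status flags in the low bits."""
    word = frame << FLAG_BITS
    if available:
        word |= AVAILABLE_BIT
    if backtrace:
        word |= BACKTRACE_BIT
    return word


def decode_last_time(word):
    """Return (frame, available, backtrace) from the coded last_time field."""
    return word >> FLAG_BITS, bool(word & AVAILABLE_BIT), bool(word & BACKTRACE_BIT)
```

With these assumed widths, the two indices share one 16-bit field instead of occupying two separate fields, and the status flags need no storage of their own.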
    • 43. Granted invention patent
    • Speech recognition system and method
    • US06370505B1
    • 2002-04-09
    • US09302370
    • 1999-04-30
    • Julian Odell
    • Julian Odell
    • G10L15/14
    • G10L15/144; G10L15/10
    • The present invention relates to a method of processing speech, in which input speech is processed to determine an input speech vector (o_r) representing a sample of the speech. A number of possible output states are defined, with each output state (j) being represented by a number of state mixture components (m). Each state mixture component is then approximated by a weighted sum of a number of predetermined generic components (x), allowing the likelihood of each output state (j) corresponding to the input speech vector (o_r) to be determined.
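The approximation described in this abstract can be sketched as follows: the generic components are evaluated once per input vector, and every state likelihood is then a weighted sum over those shared values. The use of diagonal-covariance Gaussians, the data layout, and the function names are assumptions for illustration, not details from the abstract.

```python
import math


def gaussian(o, mean, var):
    """Diagonal-covariance Gaussian density (an assumed form for a generic component)."""
    log_p = 0.0
    for o_d, m_d, v_d in zip(o, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * v_d) + (o_d - m_d) ** 2 / v_d)
    return math.exp(log_p)


def state_likelihoods(o, generic, states):
    """Likelihood of each output state j for input vector o.

    generic: list of (mean, var) pairs, one per generic component x.
    states:  dict mapping state j to a list of (mixture_weight, generic_weights),
             one entry per state mixture component m.
    """
    # Evaluate each generic component once; all states reuse these values.
    g = [gaussian(o, mean, var) for mean, var in generic]
    result = {}
    for j, mixture in states.items():
        total = 0.0
        for mixture_weight, generic_weights in mixture:
            approx = sum(w * g_x for w, g_x in zip(generic_weights, g))
            total += mixture_weight * approx
        result[j] = total
    return result
```

The point of the layout is that the cost of the Gaussian evaluations depends on the number of generic components rather than on the total number of state mixture components.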
    • 45. Granted invention patent
    • Voice processing and verification system
    • US06233556B1
    • 2001-05-15
    • US09216622
    • 1998-12-16
    • Remco Teunen; Ben Shahshahani
    • Remco Teunen; Ben Shahshahani
    • G10L15/14
    • G10L17/20; G10L15/20; G10L21/0216
    • A voice processing and verification system accounts for variations that depend on differences in telephony equipment. Models are developed for the various types of telephony equipment from many users speaking on each type of equipment. A transformation algorithm is determined for converting between each type of equipment and each of the others. In other words, a model is formed for carbon button telephony equipment from many users. Similarly, a model is formed for electret telephony equipment from many users, and for cellular telephony equipment from many users. During an enrollment, a user speaks to the system. The system forms and stores a model of the user's speech. The type of telephony equipment used in the original enrollment session is also detected and stored along with the enrollment voice model. The system determines the type of telephony equipment being used based upon the spectrum of sound it receives. The telephony equipment type determination is based upon models formed for each of the telephony equipment types spoken by many different users. Thereafter, when a current user calls in, his/her voice is compared to the stored model if the same telephony equipment as used in the enrollment is detected. If the user calls in on a type of equipment other than that used during the enrollment, the transformation for telephony equipment is applied to the model. The user's voice is then verified against the transformed model. This improves the error rate resulting from different telephony equipment types.
    • 46. Granted invention patent
    • Decoding input symbols to input/output hidden markoff models
    • US06226613B1
    • 2001-05-01
    • US09183474
    • 1998-10-30
    • William Turin
    • William Turin
    • G10L15/14
    • G10L15/142
    • The invention provides an information decoding system which takes advantage of the finite duration of channel memory and other distortions to permit efficient decoding of hidden Markov modeled information while storing only a subset of the matrices used by the prior art. The invention may be applied to the maximum a posteriori (MAP) estimation of the input symbols of an input-output hidden Markov model, which can be described by the input-output transition probability density matrices or, alternatively, by finite-state systems. The invention also applies to MAP decoding of information transmitted over channels with bursts of errors, to handwriting and speech recognition, and to other probabilistic systems.
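As background for this abstract, the sketch below shows a plain symbol-by-symbol MAP estimate of the input symbols of an input-output hidden Markov model using forward and backward vectors. It stores every forward and backward vector, so it is the baseline that the invention improves on, not the memory-saving method itself. The matrix convention (entry [i][j] of P[(x, y)] is the probability of moving from state i to state j while consuming input x and emitting output y) and all names are assumptions for illustration.

```python
def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def mat_vec(m, v):
    """Matrix times column vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def map_decode(pi, P, inputs, observed):
    """Return the MAP input symbol at each time step given the observed outputs.

    pi:       initial state distribution.
    P:        dict mapping (x, y) to an n-by-n matrix (assumed convention, see lead-in).
    inputs:   list of possible input symbols.
    observed: list of observed output symbols.
    """
    n = len(pi)
    # Marginal matrices P(y) = sum over x of P(x, y).
    marginal = {}
    for y in set(observed):
        total = [[0.0] * n for _ in range(n)]
        for x in inputs:
            for i in range(n):
                for j in range(n):
                    total[i][j] += P[(x, y)][i][j]
        marginal[y] = total

    # alphas[t]: forward vector covering outputs before time t.
    alphas = [list(pi)]
    for y in observed[:-1]:
        alphas.append(vec_mat(alphas[-1], marginal[y]))

    # betas[t]: backward vector covering outputs after time t.
    betas = [None] * len(observed)
    beta = [1.0] * n
    for t in range(len(observed) - 1, -1, -1):
        betas[t] = beta
        beta = mat_vec(marginal[observed[t]], beta)

    decoded = []
    for t, y in enumerate(observed):
        def score(x, t=t, y=y):
            step = vec_mat(alphas[t], P[(x, y)])
            return sum(a * b for a, b in zip(step, betas[t]))
        decoded.append(max(inputs, key=score))
    return decoded
```

For a memoryless channel the model has a single state and each matrix reduces to one number, which makes a convenient sanity check: the decoder then simply picks the most probable input for each observed output.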
    • 47. Granted invention patent
    • Speaker adaptation device and speech recognition device
    • US06223159B1
    • 2001-04-24
    • US09217928
    • 1998-12-22
    • Jun Ishii
    • Jun Ishii
    • G10L15/14
    • G10L15/065; G10L15/144
    • A voice feature quantity extractor extracts feature vector time-series data by acoustic feature quantity analysis of the speaker's voice. A reference speaker-dependent conversion factor computation device computes reference speaker-dependent conversion factors using a reference speaker voice data feature vector and an initial standard pattern. The reference speaker-dependent conversion factors are stored in a reference speaker-dependent conversion factor storage device. A speaker-dependent conversion factor selector selects one or more sets of reference speaker-dependent conversion factors stored in that storage device. A speaker-dependent conversion factor computation device computes speaker-dependent conversion factors using the selected set or sets. A speaker-dependent standard pattern computation device converts the parameters of the initial standard pattern using the speaker-dependent conversion factors, and the converted parameters are output as a speaker-dependent standard pattern.
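The last two stages of this pipeline (computing speaker-dependent conversion factors from selected reference sets, then converting the initial standard pattern) can be sketched as below. The abstract does not say what form the conversion factors take or how the selected sets are combined; the per-dimension scale and bias, the weighted average, and the function names are all assumptions for illustration.

```python
def combine_factors(selected, weights):
    """Combine selected reference conversion factors into speaker-dependent ones.

    selected: list of (scale, bias) pairs, each a list with one value per dimension.
    weights:  one non-negative weight per selected set (assumed combination rule:
              a weighted average).
    """
    total = float(sum(weights))
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    dim = len(selected[0][0])
    scale = [sum(w * f[0][d] for f, w in zip(selected, weights)) / total
             for d in range(dim)]
    bias = [sum(w * f[1][d] for f, w in zip(selected, weights)) / total
            for d in range(dim)]
    return scale, bias


def adapt_pattern(initial_means, factors):
    """Convert the mean vectors of the initial standard pattern with (scale, bias)."""
    scale, bias = factors
    return [[s * m + b for m, s, b in zip(mean, scale, bias)]
            for mean in initial_means]
```

A usage example under the same assumptions: two reference sets with equal weight are averaged, and the result is applied to a one-vector standard pattern.

```python
factors = combine_factors(
    [([1.0, 1.0], [0.0, 0.0]), ([3.0, 1.0], [2.0, 0.0])],
    [1.0, 1.0],
)
adapted = adapt_pattern([[1.0, 2.0]], factors)
```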
    • 48. Granted invention patent
    • Symbol insertion apparatus and method
    • US06778958B1
    • 2004-08-17
    • US09651679
    • 2000-08-30
    • Masafumi Nishimura; Nobuyasu Itoh; Shinsuke Mori
    • Masafumi Nishimura; Nobuyasu Itoh; Shinsuke Mori
    • G10L15/14
    • G10L15/18
    • An apparatus and method are provided for the insertion of punctuation marks into appropriate positions in a sentence. An acoustic processor processes input utterances to extract voice data, and transforms the data into a feature vector. When the automatic insertion of punctuation marks is not performed, a language decoder processes the feature vector using only a general-purpose language model, and inserts a comma at a location marked in the voice data by the entry “ten,” for example, which is clearly a location at which a comma should be inserted. When automatic punctuation insertion is performed, the language decoder employs the general-purpose language model and the punctuation mark language model to identify an unvoiced pause location for the insertion of a punctuation mark, such as a comma.
    • 49. Granted invention patent
    • Method for estimating a confidence measure for a speech recognition system
    • US06735562B1
    • 2004-05-11
    • US09588163
    • 2000-06-05
    • Yaxin Zhang; Ho Chuen Choi; Jian Ming Song
    • Yaxin Zhang; Ho Chuen Choi; Jian Ming Song
    • G10L15/14
    • G10L15/01
    • A method of estimating a confidence measure for a speech recognition system involves comparing an input speech signal with a number of predetermined models of possible speech signals. Best scores indicating the degree of similarity between the input speech signal and each of the predetermined models are then used to determine a normalized variance, which is used as the Confidence Measure. To determine whether the input speech signal has been correctly recognized, the Confidence Measure is compared to a threshold value. The threshold value is weighted according to the Signal to Noise Ratio of the input speech signal and according to the number of predetermined models used.
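The decision rule in this abstract can be sketched as follows. The abstract gives no formulas, so several choices here are assumptions for illustration: normalizing the variance by the squared mean score, accepting when the measure is at or above the threshold, and supplying the two threshold weightings as caller-provided functions (the defaults apply no weighting).

```python
from statistics import mean, pvariance


def normalized_variance(best_scores):
    """Variance of the best scores divided by their squared mean (assumed normalization)."""
    mu = mean(best_scores)
    if mu == 0:
        raise ValueError("mean score is zero; cannot normalize")
    return pvariance(best_scores) / (mu * mu)


def weighted_threshold(base, snr_db, n_models,
                       snr_weight=lambda snr_db: 1.0,
                       count_weight=lambda n_models: 1.0):
    """Scale a base threshold by hypothetical SNR and model-count weightings."""
    return base * snr_weight(snr_db) * count_weight(n_models)


def is_accepted(best_scores, snr_db, base_threshold, **weights):
    """Compare the confidence measure with the weighted threshold.

    Assumes one best score per predetermined model and that a larger
    normalized variance indicates a more reliable recognition result.
    """
    confidence = normalized_variance(best_scores)
    threshold = weighted_threshold(base_threshold, snr_db, len(best_scores), **weights)
    return confidence >= threshold
```

Under these assumptions, a score set in which one model clearly stands out yields a larger normalized variance than a flat score set, so it passes a threshold that the flat set does not.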