    • 74. Granted invention patent
    • Variable framerate parameter encoding
    • US5806027A
    • 1998-09-08
    • US724268
    • 1996-09-19
    • E. Bryan George; Alan V. McCree; Vishu R. Viswanathan
    • G10L15/12; G10L19/00; G10L19/06; G10L9/00
    • G10L19/07; G10L19/0018; G10L15/12
    • A novel approach to parameter encoding is presented which improves coding efficiency and performance by exploiting the variable rate nature of certain classes of signals. This is achieved using an interpolative variable frame-rate breakpointing scheme referred to as adaptive frame selection (AFS). In the approach described in this report, frame selection is achieved using a recursive dynamic programming algorithm; the resulting parameter encoding system is referred to as adaptive frame selection using dynamic programming (AFS/DP). The AFS/DP algorithm determines optimal breakpoint locations in the context of parameter encoding using an arbitrary objective performance measure, and operates in a fixed bit-rate, fixed-delay context with low computational requirements. When applied to the problem of low bit-rate coding of speech spectral and gain parameters, the AFS/DP algorithm is capable of improving the perceptual quality of coded speech and robustness to quantization errors over fixed frame-rate approaches.
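The abstract above describes choosing breakpoint frames by dynamic programming so that the skipped frames can be interpolated. The following is a minimal sketch of that idea, not the patented AFS/DP algorithm: it assumes scalar parameters, linear interpolation, squared error as the objective measure, and a fixed breakpoint count standing in for the fixed bit-rate constraint. The names `interp_error` and `select_breakpoints` are hypothetical.

```python
from typing import List, Sequence


def interp_error(x: Sequence[float], i: int, j: int) -> float:
    """Squared error of linearly interpolating the frames strictly between i and j."""
    err = 0.0
    for t in range(i + 1, j):
        w = (t - i) / (j - i)
        estimate = (1 - w) * x[i] + w * x[j]
        err += (x[t] - estimate) ** 2
    return err


def select_breakpoints(x: Sequence[float], k: int) -> List[int]:
    """Pick k frame indices (first and last always kept) minimizing total error."""
    n = len(x)
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= len(x)")
    inf = float("inf")
    # cost[m][j]: best error using m breakpoints, the last one at frame j
    cost = [[inf] * n for _ in range(k + 1)]
    back = [[-1] * n for _ in range(k + 1)]
    cost[1][0] = 0.0
    for m in range(2, k + 1):
        for j in range(m - 1, n):
            for i in range(m - 2, j):
                if cost[m - 1][i] == inf:
                    continue
                c = cost[m - 1][i] + interp_error(x, i, j)
                if c < cost[m][j]:
                    cost[m][j] = c
                    back[m][j] = i
    # Trace the chosen breakpoints back from the final frame.
    path = [n - 1]
    j = n - 1
    for m in range(k, 1, -1):
        j = back[m][j]
        path.append(j)
    return path[::-1]
```

For a triangular track such as `[0, 1, 2, 3, 2, 1, 0]` with `k = 3`, the sketch keeps the two end frames and the peak, since every other frame is then recovered exactly by interpolation.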
    • 75. Granted invention patent
    • Pattern recognition system and method
    • US5778342A
    • 1998-07-07
    • US595357
    • 1996-02-01
    • Adoram Erell; David Burshtein
    • G10L15/06; G10L15/12; G10L15/14; G10L15/20; G10L5/06
    • G10L15/063; G10L15/12; G10L15/142; G10L21/0216
    • A pattern recognition system and method is disclosed. The method includes the steps of a) providing a noisy test feature set of the input signal, a plurality of reference feature sets of reference templates produced in a quiet environment, and a background noise feature set of background noise present in the input signal, b) producing adapted reference templates from the test feature set, the background noise feature set and the reference feature sets and c) determining match scores defining the match between each of the adapted reference templates and the test feature set. The method can also include adapting the scores before accepting a score as the result. The system and method are described for both Hidden Markov Model (HMM) and Dynamic Time Warping (DTW) scoring units. The system performs the steps of the method.
    • 76. Granted invention patent
    • Voice recognition method for recognizing a word in speech
    • US5692097A
    • 1997-11-25
    • US347089
    • 1994-11-23
    • Maki Yamada; Masakatsu Hoshimi; Taisuke Watanabe; Katsuyuki Niyada
    • G10L15/10; G10L15/02; G10L15/12; G10L15/20; G10L15/28; G10L9/14
    • G10L15/12
    • An inter-frame similarity between an input voice and a standard patterned word is calculated for each of frames and for each of standard patterned words, and a posterior probability similarity is produced by subtracting a constant value from each of the inter-frame similarities. The constant value is determined by analyzing voice data obtained from specified persons to set the posterior probability similarities to positive values when a word existing in the input voice matches with the standard patterned word and to set the posterior probability similarities to negative values when a word existing in the input voice does not match with the standard patterned word. Thereafter, an accumulated similarity having an accumulated value obtained by accumulating values of the posterior probability similarities according to a continuous dynamic programming matching operation for the frames of the input voice is calculated for each of the standard patterned words. Thereafter, a particular standard patterned word relating to an accumulated similarity having a maximum value among the accumulated similarities is output as a recognized word of the input voice.
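The scoring scheme in the abstract above (subtract a constant from each frame similarity so matches score positive and mismatches negative, accumulate along a dynamic-programming path, pick the word with the largest total) can be sketched as follows. This is an illustration under stated assumptions, not the patented method: frames are scalars, the similarity `1 - |x - r|` and the constant `bias` are made up for the example, and `spot_score` and `recognize` are hypothetical names.

```python
from typing import Dict, Sequence


def spot_score(inp: Sequence[float], ref: Sequence[float], bias: float) -> float:
    """Best accumulated (similarity - bias) over any input segment warped onto ref."""
    neg = float("-inf")
    prev = [neg] * len(ref)  # scores for paths ending at the previous input frame
    best = neg
    for x in inp:
        cur = [neg] * len(ref)
        for j, r in enumerate(ref):
            local = (1.0 - abs(x - r)) - bias  # assumed similarity minus constant
            if j == 0:
                # Free start: a path may begin at any input frame.
                cur[j] = local + max(0.0, prev[0])
            else:
                cur[j] = local + max(prev[j], prev[j - 1], cur[j - 1])
        best = max(best, cur[-1])  # a path may end at any input frame
        prev = cur
    return best


def recognize(inp: Sequence[float], templates: Dict[str, Sequence[float]],
              bias: float = 0.5) -> str:
    """Return the template name with the largest accumulated score."""
    return max(templates, key=lambda name: spot_score(inp, templates[name], bias))
```

Because the shifted similarities are negative on mismatching frames, surrounding non-word input lowers a path's score rather than inflating it, which is what lets the start and end points float freely in this sketch.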
    • 77. Granted invention patent
    • Speaker independent speech recognition system and method using neural network and DTW matching technique
    • US5528728A
    • 1996-06-18
    • US89825
    • 1993-07-12
    • Yoshihiro Matsuura; Toby Skinner
    • G10L15/10; G10L15/02; G10L15/12; G10L15/16; G01L5/06; G01L9/00
    • G10L15/16; G10L15/12
    • Improved speaker independent speech recognition system and method are disclosed in which an utterance by an unspecified person into an electrical signal is input through a device such as a telephone, the electrical signal from the input telephone converting the electrical signal into a time series of characteristic multidimensional vectors, the time series of characteristic multidimensional vectors are received, each of the vectors being converted into a plurality of candidates so that the plurality of phonemes constitutes a plurality of strings of phonemes in time series as a plurality of candidates, the plurality of candidates of phonemes are compared simultaneously (one at a time) with a reference pattern of a reference string of phonemes for each word previously stored in a dictionary to determine which string of phonemes derived from the phoneme recognition means has a highest similarity to one of the reference strings of the phonemes for the respective words stored in the dictionary using a predetermined word matching technique, and at least one candidate of the words as a result of word recognition on the basis of one of the plurality of the strings of phonemes which has the highest similarity to the corresponding one of the reference strings of the respective words is output as the result of speech recognition.
    • 79. Granted invention patent
    • System and method of pattern recognition employing a multiprocessing pipelined apparatus with private pattern memory
    • US5459798A
    • 1995-10-17
    • US60579
    • 1993-05-12
    • Delbert D. Bailey; Carole Dulong
    • F02B75/02; G06K9/62; G10L15/12; G10L15/14; G10L15/28; G06K9/68
    • G06K9/6297; G10L15/285; G10L15/34; F02B2075/027; G10L15/12; G10L15/142
    • A computer implemented apparatus and method of pattern recognition utilizing a pattern recognition engine coupled with a general purpose computer system. The present invention system provides increased accuracy and performance in handwriting and voice recognition systems and may interface with general purpose computer systems. A pattern recognition engine is provided within the present invention that contains five pipelines which operate in parallel and are specially optimized for Dynamic Time Warping and Hidden Markov Models procedures for pattern recognition, especially handwriting recognition. These pipelines comprise two arithmetic pipelines, one control pipeline and two pointer pipelines. Further, a private memory is associated with each pattern recognition engine for library storage of reference or prototype patterns. Recognition procedures are partitioned across a CPU and the pattern recognition engine. Use of a private memory allows quick access of the library patterns without impeding the performance of programs operating on the main CPU or the host bus. Communication between the CPU and the pattern recognition engine is accomplished over the host bus.
    • 80. Granted invention patent
    • Speaker independent speech recognition method and system
    • US4908865A
    • 1990-03-13
    • US290816
    • 1988-12-22
    • George R. Doddington; Enrico Bocchieri
    • G10L15/00
    • G10L15/12
    • Recognition of sound units is improved by comparing frame-pair feature vectors which helps compensate for context variations in the pronunciation of sound units. A plurality of reference frames are stored of reference feature vectors representing reference words. A linear predictive coder (10) generates a plurality of spectral feature vectors for each frame of the speech signals. A filter bank system (12) transforms the spectral feature vectors to filter bank representations. A principal feature vector transformer (14) transforms the filter bank representations to an identity matrix of transformed input feature vectors. A concatenate frame system (16) concatenates the input feature vectors of adjacent frames to form the feature vector of a frame-pair. A transformer (18) and a comparator (20) compute the likelihood that each input feature vector for a frame-pair was produced by each reference frame. This computation is performed individually and independently for each reference frame-pairs. A dynamic time warper (22) constructs an optimum time path through the input speech signals for each of the computed likelihoods. A high level decision logic (24) recognizes the input speech signals as one of the reference words in response to the computed likelihoods and the optimum time paths.
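Two steps from the abstract above, concatenating adjacent frames into frame-pair vectors and time-warping the input against a reference, can be sketched as below. This is a simplified illustration only: it uses textbook dynamic time warping with Euclidean distance in place of the likelihood computation the abstract describes, and omits the linear predictive coding, filter bank, and principal feature transform stages. The function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def frame_pairs(frames: Sequence[Vector]) -> List[List[float]]:
    """Concatenate each frame with its successor to form frame-pair vectors."""
    return [list(frames[i]) + list(frames[i + 1]) for i in range(len(frames) - 1)]


def dtw_distance(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Textbook DTW: minimal summed Euclidean distance over monotonic alignments."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j]: best cost aligning the first i frames of a with the first j of b
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            d[i][j] = local + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

A recognizer built on this sketch would compute `dtw_distance(frame_pairs(inp), frame_pairs(ref))` for each reference word and choose the smallest value, which mirrors the role of the decision logic in the abstract only loosely.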