    • 61. Invention application
    • Two-stage implementation for phonetic recognition using a bi-directional target-filtering model of speech coarticulation and reduction
    • US20060200351A1
    • 2006-09-07
    • US11069474
    • 2005-03-01
    • Alejandro Acero; Dong Yu; Li Deng
    • G10L15/04
    • G10L15/02; G10L25/15; G10L25/24; G10L2015/025
    • A structured generative model of speech coarticulation and reduction is described with a novel two-stage implementation. At the first stage, the dynamics of formants or vocal tract resonance (VTR) are generated using prior information of resonance targets in the phone sequence. Bi-directional temporal filtering with finite impulse response (FIR) is applied to the segmental target sequence as the FIR filter's input. At the second stage, the dynamics of speech cepstra are predicted analytically based on the FIR-filtered VTR targets. The combined system of these two stages thus generates correlated and causally related VTR and cepstral dynamics, where phonetic reduction is represented explicitly in the hidden resonance space and implicitly in the observed cepstral space. The combined system also gives the acoustic observation probability given a phone sequence. Using this probability, different phone sequences can be compared and ranked in terms of their respective probability values. This then permits the use of the model for phonetic recognition.
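The first stage described in the abstract above (bi-directional FIR filtering of a segmental target sequence) can be illustrated with a minimal sketch. The abstract does not give the filter coefficients, so the symmetric exponentially decaying kernel, the parameters `gamma` and `half_width`, the edge renormalization, and the target values are all assumptions made for illustration only.

```python
def bidirectional_fir(targets, gamma=0.6, half_width=3):
    """Smooth a per-frame target sequence with a symmetric (two-sided) FIR kernel.

    Assumed kernel: weight gamma**|k| for offsets k in [-half_width, half_width].
    Weights are renormalized at the sequence edges so the output stays in range.
    """
    offsets = range(-half_width, half_width + 1)
    taps = [gamma ** abs(k) for k in offsets]
    n = len(targets)
    out = []
    for t in range(n):
        acc = 0.0
        norm = 0.0
        for k, w in zip(offsets, taps):
            idx = t + k
            if 0 <= idx < n:  # use only frames that exist
                acc += w * targets[idx]
                norm += w
        out.append(acc / norm)
    return out


# Hypothetical resonance targets (Hz): two phone segments of 5 frames each.
targets = [500.0] * 5 + [1500.0] * 5
trajectory = bidirectional_fir(targets)
```

Because the kernel looks both backward and forward in time, the smoothed trajectory starts moving toward the second target before the segment boundary, which is one way to picture anticipatory coarticulation; a short segment would never reach its target, which corresponds to reduction.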
    • 63. Invention application
    • Noise reduction using correction vectors based on dynamic aspects of speech and noise normalization
    • US20050259558A1
    • 2005-11-24
    • US11189974
    • 2005-07-26
    • James Droppo; Li Deng; Alejandro Acero
    • G10L21/02; G11B7/00; G11B7/24
    • G10L21/0208
    • A method and apparatus are provided for reducing noise in a signal. Under one aspect of the invention, a correction vector is selected based on a noisy feature vector that represents a noisy signal. The selected correction vector incorporates dynamic aspects of pattern signals. The selected correction vector is then added to the noisy feature vector to produce a cleaned feature vector. In other aspects of the invention, a noise value is produced from an estimate of the noise in a noisy signal. The noise value is subtracted from a value representing a portion of the noisy signal to produce a noise-normalized value. The noise-normalized value is used to select a correction value that is added to the noise-normalized value to produce a cleaned noise-normalized value. The noise value is then added to the cleaned noise-normalized value to produce a cleaned value representing a portion of a cleaned signal.
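The noise-normalization steps in the abstract above (subtract the noise value, select and add a correction, add the noise value back) can be sketched for a single scalar feature. How the correction is selected is not specified in the abstract; the nearest-centroid lookup, the `codebook` structure, and all numeric values here are hypothetical.

```python
def clean_value(noisy, noise_estimate, codebook):
    """Noise-normalize, correct, then de-normalize a single feature value.

    codebook: list of (centroid, correction) pairs; the correction whose
    centroid is nearest to the noise-normalized value is selected (assumed rule).
    """
    # Step 1: subtract the noise value to get a noise-normalized value.
    normalized = noisy - noise_estimate
    # Step 2: select a correction based on the noise-normalized value.
    _, correction = min(codebook, key=lambda entry: abs(entry[0] - normalized))
    # Step 3: add the correction to get a cleaned noise-normalized value.
    cleaned_normalized = normalized + correction
    # Step 4: add the noise value back to get the cleaned value.
    return cleaned_normalized + noise_estimate


# Hypothetical codebook of (centroid, correction) pairs.
codebook = [(0.0, -0.5), (5.0, 0.2), (10.0, 1.0)]
cleaned = clean_value(noisy=12.0, noise_estimate=8.0, codebook=codebook)
```

Selecting the correction in the normalized domain means one codebook can serve different absolute noise levels, since the lookup depends only on the offset from the noise estimate.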
    • 68. Invention application
    • Noise Suppressor for Robust Speech Recognition
    • US20100153104A1
    • 2010-06-17
    • US12335558
    • 2008-12-16
    • Dong Yu; Li Deng; Yifan Gong; Jian Wu; Alejandro Acero
    • G10L15/20
    • G10L21/0208; G10L15/20
    • Described is noise reduction technology, generally for speech input, in which a noise-suppression related gain value for a frame is determined based upon a noise level associated with that frame in addition to the signal-to-noise ratios (SNRs). In one implementation, a noise reduction mechanism is based upon minimum mean square error, Mel-frequency cepstra noise reduction technology. A high gain value (e.g., one) is set to accomplish little or no noise suppression when the noise level is below a threshold low level, and a low gain value is set or computed to accomplish large noise suppression above a threshold high noise level. A noise-power dependent function, e.g., a log-linear interpolation, is used to compute the gain between the thresholds. Smoothing may be performed by modifying the gain value based upon a prior frame's gain value. Also described is the learning of parameters used in noise reduction via a step-adaptive discriminative learning algorithm.
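The threshold behavior described in the abstract above can be sketched as follows. This covers only the noise-level part of the gain (the SNR-dependent part is omitted), and it reads "log-linear interpolation" as linear interpolation of the gain against the logarithm of the noise power, which is an assumption. The function names, `min_gain`, `alpha`, and the threshold values are hypothetical.

```python
import math


def noise_dependent_gain(noise_power, low, high, min_gain=0.1):
    """Gain as a function of frame noise power, with two thresholds.

    Below `low`: gain 1.0 (little or no suppression).
    Above `high`: gain `min_gain` (strong suppression).
    In between: linear in log(noise_power) (assumed form of the interpolation).
    """
    if noise_power <= low:
        return 1.0
    if noise_power >= high:
        return min_gain
    frac = (math.log(noise_power) - math.log(low)) / (math.log(high) - math.log(low))
    return 1.0 + frac * (min_gain - 1.0)


def smooth_gain(gain, prev_gain, alpha=0.5):
    """Blend the current gain with the prior frame's gain (assumed first-order smoother)."""
    return alpha * prev_gain + (1.0 - alpha) * gain


# Hypothetical thresholds and per-frame noise powers.
gains = []
prev = 1.0
for power in [1.0, 100.0, 10000.0]:
    prev = smooth_gain(noise_dependent_gain(power, low=10.0, high=1000.0), prev)
    gains.append(prev)
```

Smoothing against the previous frame keeps the gain from jumping abruptly when the noise estimate crosses a threshold between adjacent frames.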
    • 69. Invention application
    • PIECEWISE-BASED VARIABLE-PARAMETER HIDDEN MARKOV MODELS AND THE TRAINING THEREOF
    • US20100070279A1
    • 2010-03-18
    • US12211114
    • 2008-09-16
    • Dong Yu; Li Deng; Yifan Gong; Alejandro Acero
    • G10L15/14
    • G10L15/144
    • A speech recognition system uses Gaussian mixture variable-parameter hidden Markov models (VPHMMs) to recognize speech under many different conditions. Each Gaussian mixture component of the VPHMMs is characterized by a mean parameter μ and a variance parameter Σ. Each of these Gaussian parameters varies as a function of at least one environmental conditioning parameter, such as, but not limited to, instantaneous signal-to-noise-ratio (SNR). The way in which a Gaussian parameter varies with the environmental conditioning parameter(s) can be approximated as a piecewise function, such as a cubic spline function. Further, the recognition system formulates the mean parameter μ and the variance parameter Σ of each Gaussian mixture component in an efficient form that accommodates the use of discriminative training and parameter sharing. Parameter sharing is carried out so that the otherwise very large number of parameters in the VPHMMs can be effectively reduced with practically feasible amounts of training data.
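The idea in the abstract above of a Gaussian parameter that varies with an environmental conditioning parameter through a piecewise cubic function can be sketched for a scalar mean. This is not the patent's parameterization: a cubic Hermite interpolant with finite-difference slopes stands in for the cubic spline, and the knot positions and mean values are hypothetical.

```python
import bisect


def piecewise_cubic_mean(snr, knots, values):
    """Evaluate a piecewise cubic (Hermite) curve mean(snr) through given knots.

    knots: increasing SNR positions; values: the mean at each knot.
    Slopes at the knots are estimated by finite differences (assumed choice).
    """
    n = len(knots)
    slopes = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        slopes.append((values[hi] - values[lo]) / (knots[hi] - knots[lo]))
    # Clamp to the knot range, then locate the segment containing snr.
    snr = min(max(snr, knots[0]), knots[-1])
    i = min(bisect.bisect_right(knots, snr) - 1, n - 2)
    h = knots[i + 1] - knots[i]
    t = (snr - knots[i]) / h
    # Cubic Hermite basis functions on the unit interval.
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return (h00 * values[i] + h10 * h * slopes[i]
            + h01 * values[i + 1] + h11 * h * slopes[i + 1])


# Hypothetical knots (instantaneous SNR in dB) and mean values at those knots.
knots = [0.0, 10.0, 20.0, 30.0]
means = [1.0, 2.5, 3.0, 3.2]
mean_at_15db = piecewise_cubic_mean(15.0, knots, means)
```

Storing only the knot values for each Gaussian, and sharing knot positions across components, is one way to picture how parameter sharing can keep the parameter count manageable.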