    • 73. Granted invention patent
    • Speaker adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation
    • US07519531B2
    • 2009-04-14
    • US11093833
    • 2005-03-30
    • Alejandro Acero, Dong Yu, Li Deng
    • G10L19/06
    • G10L15/07, G10L2015/0638
    • A computer-implemented method is provided for training a hidden trajectory model, of a speech recognition system, which generates Vocal Tract Resonance (VTR) targets. The method includes obtaining generic VTR target parameters corresponding to a generic speaker used by a target selector to generate VTR target sequences. The generic VTR target parameters are scaled for a particular speaker using a speaker-dependent scaling factor for the particular speaker to generate speaker-adaptive VTR target parameters. This scaling is performed for both the training data and the test data, and for the training data, the scaling is performed iteratively with the process of obtaining the generic targets. The computation of the scaling factor makes use of the results of a VTR tracker. The speaker-adaptive VTR target parameters for the particular speaker are then stored in order to configure the hidden trajectory model to perform speech recognition for the particular speaker using the speaker-adaptive VTR target parameters.
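The scaling step in the abstract above can be illustrated with a minimal sketch. The abstract only says that the speaker-dependent scaling factor is computed from the results of a VTR tracker; the ratio-of-means estimator below, the function names, and the sample frequencies are assumptions made for illustration, not the patented procedure. The iterative re-estimation of generic targets on training data is not shown.

```python
from statistics import mean

def estimate_scale(tracked_vtr, generic_targets):
    # Hypothetical estimator: ratio of the speaker's mean tracked resonance
    # to the mean generic target. The patent's actual formula is not given
    # in the abstract.
    return mean(tracked_vtr) / mean(generic_targets)

def adapt_targets(generic_targets, scale):
    # Scale every generic VTR target by the speaker-dependent factor.
    return [scale * t for t in generic_targets]

# Illustrative values in Hz (made up for this sketch).
generic = [500.0, 1500.0, 2500.0]   # generic-speaker VTR targets
tracked = [550.0, 1650.0, 2750.0]   # VTR tracker output for one speaker

scale = estimate_scale(tracked, generic)
adapted = adapt_targets(generic, scale)
```

The adapted targets would then be stored per speaker and used in place of the generic ones at recognition time.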
    • 74. Invention patent application
    • METHOD OF PATTERN RECOGNITION USING NOISE REDUCTION UNCERTAINTY
    • US20080281591A1
    • 2008-11-13
    • US12180260
    • 2008-07-25
    • James G. Droppo, Alejandro Acero, Li Deng
    • G10L15/20
    • G10L21/0208, G10L15/20
    • A method and apparatus are provided for using the uncertainty of a noise-removal process during pattern recognition. In particular, noise is removed from a representation of a portion of a noisy signal to produce a representation of a cleaned signal. In the meantime, an uncertainty associated with the noise removal is computed and is used with the representation of the cleaned signal to modify a probability for a phonetic state in the recognition system. In particular embodiments, the uncertainty is used to modify a probability distribution, by increasing the variance in each Gaussian distribution by the amount equal to the estimated variance of the cleaned signal, which is used in decoding the phonetic state sequence in a pattern recognition task.
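The variance adjustment described in the abstract above can be sketched for a single scalar feature and a single Gaussian. The function names and the numbers are assumptions for illustration; a real recognizer would apply the same idea per dimension to every Gaussian in its acoustic model.

```python
import math

def gaussian_log_likelihood(x, mean, var):
    # Log density of a univariate Gaussian.
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def uncertainty_log_likelihood(cleaned, uncertainty_var, mean, var):
    # Widen the model variance by the estimated variance of the cleaned
    # signal before scoring, as the abstract describes.
    return gaussian_log_likelihood(cleaned, mean, var + uncertainty_var)

# A cleaned feature that lies far from the state mean (values are made up).
plain = gaussian_log_likelihood(3.0, 0.0, 1.0)
widened = uncertainty_log_likelihood(3.0, 1.0, 0.0, 1.0)
```

With a non-zero uncertainty the outlying frame is penalized less, so unreliable frames have less influence on the decoded state sequence.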
    • 75. Invention patent application
    • Generic framework for large-margin MCE training in speech recognition
    • US20080201139A1
    • 2008-08-21
    • US11708440
    • 2007-02-20
    • Dong Yu, Alejandro Acero, Li Deng, Xiaodong He
    • G10L15/00
    • G10L15/063, G10L2015/0631
    • A method and apparatus for training an acoustic model are disclosed. A training corpus is accessed and converted into an initial acoustic model. Scores are calculated for a correct class and competitive classes, respectively, for each token given the initial acoustic model. Also, a sample-adaptive window bandwidth is calculated for each training token. From the calculated scores and the sample-adaptive window bandwidth values, loss values are calculated based on a loss function. The loss function, which may be derived from a Bayesian risk minimization viewpoint, can include a margin value that moves a decision boundary such that token-to-boundary distances for correct tokens that are near the decision boundary are maximized. The margin can either be a fixed margin or can vary monotonically as a function of algorithm iterations. The acoustic model is updated based on the calculated loss values. This process can be repeated until an empirical convergence is met.
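The loss computation in the abstract above can be sketched as follows. The abstract does not state the loss function, so this uses a sigmoid loss of the kind commonly seen in minimum classification error (MCE) training, with the margin added inside the sigmoid; the function names, the linear margin schedule, and all numeric values are assumptions for illustration.

```python
import math

def misclassification_measure(correct_score, competitor_scores):
    # Negative when the correct class outscores the best competitor.
    return max(competitor_scores) - correct_score

def large_margin_loss(d, bandwidth, margin):
    # Sigmoid loss; a positive margin shifts the decision boundary so that
    # correct tokens close to it still incur a noticeable loss.
    # `bandwidth` stands in for the sample-adaptive window bandwidth.
    return 1.0 / (1.0 + math.exp(-bandwidth * (d + margin)))

def margin_schedule(iteration, start=0.0, step=0.1):
    # Hypothetical schedule: the margin grows monotonically with iteration.
    return start + step * iteration

# A correctly classified token near the boundary (scores are made up).
d = misclassification_measure(correct_score=-10.0, competitor_scores=[-10.5, -12.0])
loss_no_margin = large_margin_loss(d, bandwidth=2.0, margin=0.0)
loss_with_margin = large_margin_loss(d, bandwidth=2.0, margin=1.0)
```

In a training loop, the model parameters would be updated from these loss values and the margin would be recomputed from `margin_schedule` at each iteration until convergence.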
    • 77. Granted invention patent
    • Method and apparatus for identifying noise environments from noisy signals
    • US07266494B2
    • 2007-09-04
    • US10985896
    • 2004-11-10
    • James G. Droppo, Alejandro Acero, Li Deng
    • G10L21/02
    • G10L21/0208, G10L15/20, G10L21/0216
    • A method and apparatus are provided for identifying a noise environment for a frame of an input signal based on at least one feature for that frame. To identify the noise environment, a probability for a noise environment is determined by applying the noisy input feature vector to a distribution of noisy training feature vectors. In one embodiment, each noisy training feature vector in the distribution is formed by modifying a set of clean training feature vectors. In one embodiment, the probabilities of the noise environments for past frames are included in the identification of an environment for a current frame. In one embodiment, a correction vector is then selected based on the identified noise environment.
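The identification steps in the abstract above can be sketched as follows, under simplifying assumptions: each frame is reduced to one scalar feature, each noise environment is modeled by a single Gaussian over noisy training features, and past-frame probabilities are folded in by exponential smoothing. The environment names, parameters, correction values, and the smoothing rule are all hypothetical.

```python
import math

# Hypothetical environments: Gaussian over the noisy feature plus a
# correction value to apply once the environment is identified.
ENVIRONMENTS = {
    "car": {"mean": 2.0, "var": 1.0, "correction": -1.5},
    "babble": {"mean": 5.0, "var": 1.0, "correction": -3.0},
}

def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def frame_probabilities(feature):
    # Probability of each environment for one frame, normalized to sum to 1.
    likes = {n: gaussian_pdf(feature, e["mean"], e["var"])
             for n, e in ENVIRONMENTS.items()}
    total = sum(likes.values())
    return {n: v / total for n, v in likes.items()}

def identify(features, memory=0.5):
    # Blend each frame's probabilities with those of past frames, then pick
    # the most probable environment and its correction for every frame.
    smoothed = {n: 1.0 / len(ENVIRONMENTS) for n in ENVIRONMENTS}
    decisions = []
    for x in features:
        current = frame_probabilities(x)
        smoothed = {n: memory * smoothed[n] + (1 - memory) * current[n]
                    for n in ENVIRONMENTS}
        best = max(smoothed, key=smoothed.get)
        decisions.append((best, ENVIRONMENTS[best]["correction"]))
    return decisions

decisions = identify([2.0, 2.0, 3.6])
```

In this example the last frame, taken alone, is slightly more probable under the "babble" model, but the history of earlier frames keeps the decision at "car".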