    • 2. Granted invention patent
    • Closed-loop multimode mixed-domain linear prediction (MDLP) speech coder
    • US06640209B1
    • 2003-10-28
    • US09259151
    • 1999-02-26
    • Amitava Das
    • Amitava Das
    • G10L19/04
    • G10L19/18
    • A closed-loop, multimode, mixed-domain linear prediction (MDLP) speech coder includes a high-rate, time-domain coding mode, a low-rate, frequency-domain coding mode, and a closed-loop mode-selection mechanism for selecting a coding mode for the coder based upon the speech content of frames input to the coder. Transition speech (i.e., from unvoiced speech to voiced speech, or vice versa) frames are encoded with the high-rate, time-domain coding mode, which may be a CELP coding mode. Voiced speech frames are encoded with the low-rate, frequency-domain coding mode, which may be a harmonic coding mode. Phase parameters are not encoded by the frequency-domain coding mode, and are instead modeled in accordance with, e.g., a quadratic phase model. For each speech frame encoded with the frequency-domain coding mode, the initial phase value is taken to be the initial phase value of the immediately preceding speech frame encoded with the frequency-domain coding mode. If the immediately preceding speech frame was encoded with the time-domain coding mode, the initial phase value of the current speech frame is computed from the decoded speech frame information of the immediately preceding, time-domain-encoded speech frame. Each speech frame encoded with the frequency-domain coding mode may be compared with the corresponding input speech frame to obtain a performance measure. If the performance measure falls below a predefined threshold value, the input speech frame is encoded with the time-domain coding mode.
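The closed-loop mode selection described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented coder: the function names, the use of SNR as the performance measure, the 10 dB threshold, and the `freq_codec` callable (a hypothetical encode-then-decode round trip for the frequency-domain mode) are all assumptions made for the example.

```python
import math
from typing import Callable, List, Sequence

Frame = Sequence[float]


def snr_db(reference: Frame, decoded: Frame) -> float:
    """Signal-to-noise ratio in dB (assumed stand-in for the performance measure)."""
    signal = sum(x * x for x in reference)
    noise = sum((x - y) ** 2 for x, y in zip(reference, decoded))
    if noise == 0.0:
        return math.inf
    if signal == 0.0:
        return -math.inf
    return 10.0 * math.log10(signal / noise)


def select_mode(
    frame: Frame,
    is_voiced: bool,
    freq_codec: Callable[[Frame], List[float]],
    threshold_db: float = 10.0,  # hypothetical threshold value
) -> str:
    """Return "freq" or "time" for one input frame.

    Non-voiced (e.g. transition) frames go straight to the time-domain mode.
    Voiced frames are first coded in the frequency domain; the decoded result
    is compared with the input, and the coder falls back to the time-domain
    mode when the performance measure drops below the threshold.
    """
    if not is_voiced:
        return "time"
    decoded = freq_codec(frame)
    if snr_db(frame, decoded) < threshold_db:
        return "time"
    return "freq"
```

For instance, a round trip that reproduces the frame exactly keeps the frequency-domain mode, while one that returns silence yields 0 dB and triggers the time-domain fallback.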
    • 3. Invention patent application
    • SPEAKER RECOGNITION VIA VOICE SAMPLE BASED ON MULTIPLE NEAREST NEIGHBOR CLASSIFIERS
    • US20120016673A1
    • 2012-01-19
    • US13246681
    • 2011-09-27
    • Amitava Das
    • Amitava Das
    • G10L17/00
    • G10L17/10
    • A speaker recognition system generates a codebook store with codebooks representing voice samples of speakers, referred to as trainers. The speaker recognition system may use multiple classifiers and generate a codebook store for each classifier. Each classifier uses a different set of features of a voice sample as its features. A classifier inputs a voice sample of a person and tries to authenticate or identify the person. A classifier generates a sequence of feature vectors for the input voice sample and then a code vector for that sequence. The classifier uses its codebook store to recognize the person. The speaker recognition system then combines the scores of the classifiers to generate an overall score. If the score satisfies a recognition criterion, then the speaker recognition system indicates that the voice sample is from that speaker.
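As a rough illustration of the multi-classifier scheme in the abstract above, the sketch below gives each classifier its own feature extractor and codebook store, scores a voice sample against a claimed speaker, and sums the per-classifier scores. Several details are assumptions made for the example and are not stated in the abstract: the code vector is taken to be the mean of the feature vectors, the score is a Euclidean nearest-neighbor distance, scores are combined by simple addition, and the recognition criterion is a fixed threshold. The `Classifier` class and `verify` function are hypothetical names.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Collapse a sequence of feature vectors into one code vector (assumed: mean)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


class Classifier:
    """One classifier with its own feature set and codebook store."""

    def __init__(
        self,
        extract: Callable[[object], List[Vector]],
        codebooks: Dict[str, List[Vector]],
    ) -> None:
        self.extract = extract      # voice sample -> sequence of feature vectors
        self.codebooks = codebooks  # trainer name -> code vectors

    def score(self, sample: object, speaker: str) -> float:
        """Nearest-neighbor distance from the sample's code vector to the speaker's codebook."""
        code = mean_vector(self.extract(sample))
        return min(math.dist(code, entry) for entry in self.codebooks[speaker])


def verify(
    sample: object,
    speaker: str,
    classifiers: Sequence[Classifier],
    threshold: float,
) -> bool:
    """Combine per-classifier scores (assumed: sum) and apply the criterion."""
    overall = sum(c.score(sample, speaker) for c in classifiers)
    return overall <= threshold
```

Lower scores mean a closer match here, so the criterion accepts the claimed speaker when the combined distance is at or below the threshold.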
    • 4. Granted invention patent
    • Speaker recognition via voice sample based on multiple nearest neighbor classifiers
    • US08050919B2
    • 2011-11-01
    • US11771794
    • 2007-06-29
    • Amitava Das
    • Amitava Das
    • G10L17/00
    • G10L17/10
    • A speaker recognition system generates a codebook store with codebooks representing voice samples of speakers, referred to as trainers. The speaker recognition system may use multiple classifiers and generate a codebook store for each classifier. Each classifier uses a different set of features of a voice sample as its features. A classifier inputs a voice sample of a person and tries to authenticate or identify the person. A classifier generates a sequence of feature vectors for the input voice sample and then a code vector for that sequence. The classifier uses its codebook store to recognize the person. The speaker recognition system then combines the scores of the classifiers to generate an overall score. If the score satisfies a recognition criterion, then the speaker recognition system indicates that the voice sample is from that speaker.
    • 5. Invention patent application
    • SPEAKER RECOGNITION VIA VOICE SAMPLE BASED ON MULTIPLE NEAREST NEIGHBOR CLASSIFIERS
    • US20090006093A1
    • 2009-01-01
    • US11771794
    • 2007-06-29
    • Amitava Das
    • Amitava Das
    • G10L15/00; G10L17/00
    • G10L17/10
    • A speaker recognition system generates a codebook store with codebooks representing voice samples of speakers, referred to as trainers. The speaker recognition system may use multiple classifiers and generate a codebook store for each classifier. Each classifier uses a different set of features of a voice sample as its features. A classifier inputs a voice sample of a person and tries to authenticate or identify the person. A classifier generates a sequence of feature vectors for the input voice sample and then a code vector for that sequence. The classifier uses its codebook store to recognize the person. The speaker recognition system then combines the scores of the classifiers to generate an overall score. If the score satisfies a recognition criterion, then the speaker recognition system indicates that the voice sample is from that speaker.
    • 8. Granted invention patent
    • Method and apparatus for tracking the phase of a quasi-periodic signal
    • US06449592B1
    • 2002-09-10
    • US09259247
    • 1999-02-26
    • Amitava Das
    • Amitava Das
    • G10L19/14
    • G10L19/02
    • A method for tracking the phase of a quasi-periodic signal includes the steps of estimating the phase of the signal for frames during which the signal is periodic, monitoring the performance of the estimated phase with a closed-loop performance measure, and measuring the phase of the signal for frames during which the signal is periodic and performance of the estimated phase falls below a predefined threshold level. In estimating the phase, the initial phase value is set equal to the estimated final phase value of the previous frame if the previous frame was periodic. The initial phase value is set equal to a measured phase value of the previous frame if the previous frame was nonperiodic, or if the previous frame was periodic and performance of the estimated phase for the previous frame fell below the predefined threshold level. For frames during which the signal is nonperiodic, the phase of the signal is measured. An open-loop periodicity decision can be used to determine whether the signal is periodic for a given frame.
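The decision rules in the abstract above (when to estimate versus measure the phase, and how to seed the initial phase from the previous frame) can be written out as a small sketch. The `FrameState` record, its field names, and the convention that a higher performance value is better are assumptions introduced for the example; the abstract does not specify how the phase itself is estimated or measured, so that part is left out.

```python
from dataclasses import dataclass


@dataclass
class FrameState:
    """Summary of the previous frame (hypothetical record for this sketch)."""
    periodic: bool                 # open-loop periodicity decision
    estimated_final_phase: float   # final phase from the estimator
    measured_phase: float          # directly measured phase
    performance: float             # closed-loop performance of the estimate


def phase_action(periodic: bool, performance: float, threshold: float) -> str:
    """Decide how the current frame's phase is obtained.

    Nonperiodic frames are always measured. Periodic frames use the estimate
    unless its closed-loop performance falls below the threshold.
    """
    if not periodic:
        return "measure"
    if performance < threshold:
        return "measure"
    return "estimate"


def initial_phase(prev: FrameState, threshold: float) -> float:
    """Pick the initial phase value for the current frame's estimate.

    The previous frame's estimated final phase is reused only if that frame
    was periodic and its estimate performed acceptably; otherwise the
    previous frame's measured phase is used.
    """
    if prev.periodic and prev.performance >= threshold:
        return prev.estimated_final_phase
    return prev.measured_phase
```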
    • 9. Granted invention patent
    • Multipulse interpolative coding of transition speech frames
    • US06260017B1
    • 2001-07-10
    • US09307294
    • 1999-05-07
    • Amitava Das; Sharath Manjunath
    • Amitava Das; Sharath Manjunath
    • G10L19/10
    • G10L19/18; G10L19/10
    • A multipulse interpolative coder for transition speech frames includes an extractor configured to represent a first frame of transitional speech samples by a subset of the samples of the frame. The coder also includes an interpolator configured to interpolate the subset of samples and a subset of samples extracted from an earlier-received frame to synthesize other samples of the first frame that are not included in the subset. The subset of samples is further simplified by selecting a set of pulses from the subset and assigning zero values to unselected pulses. In the alternative, a portion of the unselected pulses may be quantized. The set of pulses may be the pulses having the greatest absolute amplitudes in the subset. In the alternative, the set of pulses may be the most perceptually significant pulses of the subset.
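The pulse-selection step mentioned in the abstract above (keep the pulses with the greatest absolute amplitudes, assign zero to the rest) is simple enough to show directly. This sketch covers only that step; the extraction and interpolation stages are not modeled, and the function name and the `k` parameter are hypothetical.

```python
from typing import List, Sequence


def keep_largest_pulses(subset: Sequence[float], k: int) -> List[float]:
    """Keep the k samples with the greatest absolute amplitude, zero the others.

    Positions are preserved so the result has the same length as the input.
    Ties are resolved in favor of the earlier sample.
    """
    k = max(k, 0)
    # Indices ordered by descending absolute amplitude (stable for ties).
    order = sorted(range(len(subset)), key=lambda i: abs(subset[i]), reverse=True)
    selected = set(order[:k])
    return [value if i in selected else 0.0 for i, value in enumerate(subset)]
```

With the input `[0.1, -0.9, 0.3, 0.7, -0.2]` and `k=2`, only the samples `-0.9` and `0.7` survive and every other position becomes zero.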
    • 10. Granted invention patent
    • Object identification and verification using transform vector quantization
    • US07991199B2
    • 2011-08-02
    • US11771879
    • 2007-06-29
    • Amitava Das
    • Amitava Das
    • G06K9/00
    • G06K9/6223; G06K9/6272
    • An identification system uses mappings of known objects to codebooks representing those objects to identify an object represented by multiple input representations or to verify that an input representation corresponds to an input known object. To identify the object, the identification system generates an input feature vector for each input representation. The identification system then accumulates for each known object the distances between the codebook of that object and each of the input feature vectors. The distance between a codebook and a feature vector may be the minimum of the distances between the code vectors of the codebook and the feature vector. The identification system then selects the object with the smallest accumulated distance as being the object represented by the multiple input representations.
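The identification procedure in the abstract above maps fairly directly to code: the distance between a codebook and a feature vector is the minimum distance to any of its code vectors, distances are accumulated per known object over all input feature vectors, and the object with the smallest total wins. The sketch below assumes Euclidean distance and plain Python lists for vectors; the function names are hypothetical, and the transform and quantization steps that would produce the codebooks are not shown.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def codebook_distance(codebook: Sequence[Vector], feature: Vector) -> float:
    """Minimum distance between the feature vector and any code vector."""
    return min(math.dist(code, feature) for code in codebook)


def accumulated_distances(
    codebooks: Dict[str, List[Vector]], features: Sequence[Vector]
) -> Dict[str, float]:
    """Sum, for each known object, the codebook distance to every input feature vector."""
    return {
        name: sum(codebook_distance(codebook, f) for f in features)
        for name, codebook in codebooks.items()
    }


def identify(codebooks: Dict[str, List[Vector]], features: Sequence[Vector]) -> str:
    """Return the known object with the smallest accumulated distance."""
    totals = accumulated_distances(codebooks, features)
    return min(totals, key=totals.get)
```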