    • 1. Granted patent
    • Voiced/unvoiced speech classifier
    • US06640208B1
    • 2003-10-28
    • US09659318
    • 2000-09-12
    • Yaxin Zhang, Jianming Song, Anton Madievski
    • Yaxin Zhang, Jianming Song, Anton Madievski
    • G10L11/06
    • G10L25/93
    • A voiced/unvoiced speech classifier (30) includes a speech segmentor (34) which segments an input digitized speech waveform into frames of speech and a band-pass filter (36) which filters the frames of speech. A relative energy generator (38) generates a relative energy value for each filtered frame of speech and a decision parameter generator (52) including an autocorrelation calculator (54) and a pitch calculator (56) generates a decision parameter based on an autocorrelation function and a pitch frequency index for the filtered frames of speech. A normalized energy calculator (46) adjusts the threshold and then normalizes the relative energy. A comparator (60) provides a signal indicative of whether a frame of speech is voiced speech or unvoiced speech depending on a comparison of the decision parameter and the normalized relative energy value for each filtered frame of speech.
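The abstract for this record names the processing stages but not the exact decision rule. As a minimal sketch only, assuming non-overlapping frames, no band-pass stage, and fixed thresholds (`periodicity_floor`, `energy_floor`) standing in for the patent's adaptive comparison, the flow could look like this in Python. All function and parameter names are hypothetical.

```python
def split_frames(samples, frame_len):
    """Split a sample list into non-overlapping frames, dropping a short tail."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def relative_energies(frames):
    """Energy of each frame divided by the largest frame energy."""
    energies = [sum(x * x for x in f) for f in frames]
    peak = max(energies) or 1.0  # avoid dividing by zero on pure silence
    return [e / peak for e in energies]

def autocorr_peak(frame, min_lag, max_lag):
    """Largest normalized autocorrelation over the candidate pitch lags."""
    r0 = sum(x * x for x in frame)
    if r0 == 0.0:
        return 0.0
    best = -1.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        best = max(best, r / r0)
    return best

def classify(samples, frame_len=240, min_lag=20, max_lag=100,
             periodicity_floor=0.5, energy_floor=0.05):
    """Return one boolean per frame: True for voiced, False for unvoiced."""
    frames = split_frames(samples, frame_len)
    if not frames:
        return []
    labels = []
    for frame, energy in zip(frames, relative_energies(frames)):
        # Assumed rule: a frame is voiced when it is both periodic and loud.
        periodic = autocorr_peak(frame, min_lag, max_lag) >= periodicity_floor
        labels.append(periodic and energy >= energy_floor)
    return labels
```

At an 8 kHz sampling rate, the default lag range of 20 to 100 samples corresponds to pitch candidates between 400 Hz and 80 Hz.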
    • 2. Granted patent
    • Tone based speech recognition
    • US06553342B1
    • 2003-04-22
    • US09496868
    • 2000-02-02
    • Yaxin Zhang, Jianming Song, Anton Madievski
    • Yaxin Zhang, Jianming Song, Anton Madievski
    • G10L15/02
    • G10L15/02, G10L25/15
    • A method and apparatus for speech recognition involves classifying (38) a digitized speech segment according to whether the speech segment comprises voiced or unvoiced speech and utilizing that classification to generate tonal feature vectors (41) of the speech segment when the speech is voiced. The tonal feature vectors are then combined (42) with other non-tonal feature vectors (40) to provide speech feature vectors. The speech feature vectors are compared (35) with previously stored models of speech feature vectors (37) for different segments of speech to determine which previously stored model is a most likely match for the segment to be recognized.
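The combination step in this record's abstract can be illustrated with a small Python sketch. It assumes, purely for illustration, that the tonal features are the frame pitch and its change from the previous voiced frame, and that unvoiced frames receive zeros. The abstract does not specify either choice, and `combine_features` is a hypothetical name.

```python
def combine_features(non_tonal, pitch_hz, voiced):
    """Append two tonal features (pitch, pitch delta) to each frame's features."""
    combined = []
    prev_pitch = None
    for feats, pitch, is_voiced in zip(non_tonal, pitch_hz, voiced):
        if is_voiced:
            delta = 0.0 if prev_pitch is None else pitch - prev_pitch
            tonal = [pitch, delta]
            prev_pitch = pitch
        else:
            # Assumption: unvoiced frames carry no tonal information.
            tonal = [0.0, 0.0]
            prev_pitch = None
        combined.append(list(feats) + tonal)
    return combined
```

The resulting vectors would then be scored against stored models, which this sketch leaves out.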
    • 3. Patent application
    • Method and apparatus of increasing speech intelligibility in noisy environments
    • US20060270467A1
    • 2006-11-30
    • US11137182
    • 2005-05-25
    • Jianming Song, John Johnson
    • Jianming Song, John Johnson
    • H04B1/38, H04M1/00
    • H03G3/3089, G10L21/0208, G10L21/0232, G10L25/15, H04M1/6025
    • A method (400, 600, 700) and apparatus (220) for enhancing the intelligibility of speech emitted into a noisy environment. After filtering (408) ambient noise with a filter (304) that simulates the physical blocking of noise by at least a part of a voice communication device (102), a frequency dependent SNR of received voice audio relative to ambient noise is computed (424) on a perceptual (e.g. Bark) frequency scale. Formants are identified (426, 600, 700) and the SNR in bands including certain formants is modified (508, 510) with formant enhancement gain factors in order to improve intelligibility. A set of high pass filter gains (338) is combined (516) with the formant enhancement gain factors, yielding combined gains which are clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532) and used to reconstruct (532, 534) an audio signal.
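The per-band SNR step in this record's abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method: it uses Traunmüller's closed-form approximation of the Bark scale, groups spectral bins into integer Bark bands, and leaves out formant detection and gain shaping entirely. `hz_to_bark` and `band_snr_db` are hypothetical names.

```python
import math

def hz_to_bark(freq_hz):
    """Traunmüller's approximation of the Bark scale, floored at 0."""
    return max(0.0, 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53)

def band_snr_db(freqs_hz, voice_power, noise_power, floor=1e-12):
    """Sum bin powers into integer Bark bands and return {band: SNR in dB}."""
    voice, noise = {}, {}
    for f, v, n in zip(freqs_hz, voice_power, noise_power):
        band = int(hz_to_bark(f))
        voice[band] = voice.get(band, 0.0) + v
        noise[band] = noise.get(band, 0.0) + n
    # The floor keeps the logarithm defined when a band holds no power.
    return {b: 10.0 * math.log10(max(voice[b], floor) / max(noise[b], floor))
            for b in voice}
```

Bands with low SNR that also contain a formant would be the candidates for extra gain in the scheme the abstract describes.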
    • 5. Patent application
    • Method of refining statistical pattern recognition models and statistical pattern recognizers
    • US20060136205A1
    • 2006-06-22
    • US11018271
    • 2004-12-21
    • Jianming Song
    • Jianming Song
    • G10L15/06
    • G10L15/063, G06K9/6277, G10L2015/0635
    • A device (800) performs statistical pattern recognition using model parameters that are refined by optimizing an objective function that includes a term for many items of training data for which recognition errors occur, wherein each term depends on a relative magnitude of a first score for a recognition result for an item of training data and a second score calculated by evaluating a statistical pattern recognition model identified by a transcribed identity of the training data item with feature vectors extracted from the item of training data. The objective function does not include terms for items of training data for which there is a gross discrepancy between a transcribed identity and a recognized identity. Gross discrepancies can be detected by probability score or pattern identity comparisons. Terms of the objective function are weighted based on the type of recognition error and weights can be increased for high priority patterns.
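One way to read the objective described in this record's abstract is sketched below in Python. The sigmoid of the score gap, the `gross_gap` cutoff, and the per-pattern `weights` lookup are all assumptions made for illustration. The abstract only says that each term depends on the relative magnitude of the two scores, that grossly discrepant items are excluded, and that terms are weighted.

```python
import math

def refinement_objective(items, gross_gap=50.0, weights=None):
    """Sum weighted error terms over misrecognized training items.

    Each item is a dict with keys "recognized", "transcribed",
    "recognized_score" and "transcribed_score" (hypothetical layout).
    """
    weights = weights or {}
    total = 0.0
    for item in items:
        if item["recognized"] == item["transcribed"]:
            continue  # no recognition error, so no term
        gap = item["recognized_score"] - item["transcribed_score"]
        if gap > gross_gap:
            continue  # gross discrepancy: the transcription is suspect
        # Assumed weighting: higher weight for high priority patterns.
        w = weights.get(item["transcribed"], 1.0)
        total += w / (1.0 + math.exp(-gap))
    return total
```

Minimizing a sum of this shape with respect to the model parameters pushes the score of the transcribed model up relative to the competing recognition result.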
    • 6. Granted patent
    • Cohort model selection apparatus and method
    • US06393397B1
    • 2002-05-21
    • US09332927
    • 1999-06-14
    • Ho Chuen Choi, Xiaoyuan Zhu, Jianming Song
    • Ho Chuen Choi, Xiaoyuan Zhu, Jianming Song
    • G10L15/06
    • G10L17/04, G10L17/12
    • An apparatus for selecting a cohort model for use in a speaker verification system includes a model generator (108) for determining a target speaker model (114) from a speech sample collected from the target speaker (106). A cohort selector (110) determines a similarity value between each of a number of predetermined existing speaker models from a model pool (112) and the target speaker model (114) and a dissimilarity value between each of the existing speaker models and any previously selected cohort models (116). An existing speaker model which is most similar to the target speaker model, but most dissimilar to previously chosen cohort models, is then chosen as another cohort model for the target speaker.
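The greedy selection described in this record's abstract can be sketched in Python. The sketch assumes each speaker model is reduced to a plain vector, that similarity is negative Euclidean distance, and that the two criteria are combined by simple subtraction. None of these choices comes from the abstract, and `similarity` and `select_cohort` are hypothetical names.

```python
import math

def similarity(a, b):
    """Assumed similarity measure: negative Euclidean distance."""
    return -math.dist(a, b)

def select_cohort(target, pool, size):
    """Greedily pick up to `size` model names from `pool` (name -> vector)."""
    chosen = []
    while len(chosen) < min(size, len(pool)):
        best_name, best_score = None, -math.inf
        for name, vec in pool.items():
            if name in chosen:
                continue
            # Reward closeness to the target speaker model.
            score = similarity(vec, target)
            if chosen:
                # Penalize closeness to the nearest already-chosen cohort model.
                score -= max(similarity(vec, pool[c]) for c in chosen)
            if score > best_score:
                best_name, best_score = name, score
        chosen.append(best_name)
    return chosen
```

For example, with a target at the origin and pool models `a=(1, 0)`, `b=(1.1, 0)`, `c=(-1.5, 0)`, the first pick is `a` and the second is `c`, because `b` sits almost on top of `a` and adds little new coverage around the target.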