    • 5. Invention application
    • Method and apparatus for continuous valued vocal tract resonance tracking using piecewise linear approximations
    • US20050114134A1
    • 2005-05-26
    • US10723995
    • 2003-11-26
    • Li Deng, Hagai Attias, Alejandro Acero, Leo Lee
    • Li Deng, Hagai Attias, Alejandro Acero, Leo Lee
    • G10L15/10, G10L11/00, G10L15/02, G10L15/14, G10L15/28, G10L19/06
    • G10L25/48, G10L25/15
    • A method and apparatus track vocal tract resonance components, including both frequencies and bandwidths, in a speech signal. The components are tracked by defining a state equation that is linear with respect to a past vocal tract resonance vector and that predicts a current vocal tract resonance vector. An observation equation is also defined that is linear with respect to a current vocal tract resonance vector and that predicts at least one component of an observation vector. The state equation, the observation equation, and a sequence of observation vectors are used to identify a sequence of vocal tract resonance vectors using a Kalman filter algorithm. In one embodiment, the observation equation is defined based on a piecewise linear approximation to a non-linear function. The parameters of the linear approximation are selected based on pre-defined regions, which are determined from a crude estimate of a vocal tract resonance vector.
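The tracking scheme in the abstract above can be illustrated with a minimal scalar sketch. It is not the patent's implementation: the non-linear function `h`, the unit-width regions, the chord-based linearization, the noise variances, and all function names are assumptions chosen only to show how a linear state equation, a piecewise linear observation equation selected from a crude estimate, and a Kalman filter update fit together.

```python
import math


def h(x):
    """Stand-in non-linear observation function (an assumption for illustration)."""
    return x * x


def linear_piece(crude, width=1.0):
    """Slope and intercept of the chord of h over the region containing `crude`."""
    lo = math.floor(crude / width) * width
    hi = lo + width
    slope = (h(hi) - h(lo)) / (hi - lo)
    intercept = h(lo) - slope * lo
    return slope, intercept


def kalman_track(observations, crude_estimates, x0, p0, phi=1.0, q=0.01, r=0.01):
    """Scalar Kalman filter with a piecewise linear observation equation.

    State equation:       x_t = phi * x_{t-1} + w,  Var(w) = q
    Observation equation: o_t = a * x_t + b + v,    Var(v) = r
    where (a, b) is picked per frame from the region of the crude estimate.
    """
    x, p = x0, p0
    track = []
    for o, crude in zip(observations, crude_estimates):
        # Prediction step from the linear state equation.
        x_pred = phi * x
        p_pred = phi * p * phi + q
        # Select the linear approximation for this frame's region.
        a, b = linear_piece(crude)
        # Update step using the linearized observation equation.
        gain = p_pred * a / (a * p_pred * a + r)
        x = x_pred + gain * (o - (a * x_pred + b))
        p = (1.0 - gain * a) * p_pred
        track.append(x)
    return track


if __name__ == "__main__":
    frames = 50
    obs = [h(2.5)] * frames      # noiseless observations of a constant state
    crude = [2.4] * frames       # crude estimates falling in the region [2, 3)
    print(kalman_track(obs, crude, x0=2.0, p0=1.0)[-1])
```

In this toy setup the estimate settles at the value that the linearized model maps to the observation, not at the true state, which shows that the accuracy of the track depends on how well each linear piece fits the non-linear function within its region.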
    • 6. Invention application
    • GRAPHEME-TO-PHONEME CONVERSION USING ACOUSTIC DATA
    • US20110251844A1
    • 2011-10-13
    • US13164683
    • 2011-06-20
    • Xiao Li, Asela J. R. Gunawardana, Alejandro Acero
    • Xiao Li, Asela J. R. Gunawardana, Alejandro Acero
    • G10L15/04
    • G10L13/08, G10L15/063, G10L15/187
    • Described is the use of acoustic data to improve grapheme-to-phoneme conversion for speech recognition, such as to more accurately recognize spoken names in a voice-dialing system. A joint model of acoustics and graphonemes (acoustic data, phoneme sequences, grapheme sequences and an alignment between phoneme sequences and grapheme sequences) is described, as is retraining by maximum likelihood training and discriminative training in adapting graphoneme model parameters using acoustic data. Also described is the unsupervised collection of grapheme labels for received acoustic data, thereby automatically obtaining a substantial number of actual samples that may be used in retraining. Speech input that does not meet a confidence threshold may be filtered out so as not to be used by the retrained model.
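The confidence filtering mentioned at the end of the abstract above can be sketched as follows. The `Utterance` record, its field names, and the 0.8 threshold are hypothetical, since the abstract does not specify a data layout or a threshold value; the sketch only shows the idea of keeping unsupervised samples whose recognition confidence meets a threshold before they are used for retraining.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Hypothetical record for one collected sample."""
    grapheme_label: str   # label obtained without supervision, e.g. a dialed name
    confidence: float     # recognizer confidence for this sample, in [0, 1]


def select_for_retraining(utterances, threshold=0.8):
    """Keep only samples whose confidence meets the (assumed) threshold."""
    return [u for u in utterances if u.confidence >= threshold]


if __name__ == "__main__":
    collected = [
        Utterance("Acero", 0.95),
        Utterance("Gunawardana", 0.42),
        Utterance("Li", 0.80),
    ]
    for u in select_for_retraining(collected):
        print(u.grapheme_label, u.confidence)
```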
    • 7. Invention application
    • GRAPHEME-TO-PHONEME CONVERSION USING ACOUSTIC DATA
    • US20090150153A1
    • 2009-06-11
    • US11952267
    • 2007-12-07
    • Xiao Li, Asela J. R. Gunawardana, Alejandro Acero
    • Xiao Li, Asela J. R. Gunawardana, Alejandro Acero
    • G10L15/00
    • G10L13/08, G10L15/063, G10L15/187
    • Described is the use of acoustic data to improve grapheme-to-phoneme conversion for speech recognition, such as to more accurately recognize spoken names in a voice-dialing system. A joint model of acoustics and graphonemes (acoustic data, phoneme sequences, grapheme sequences and an alignment between phoneme sequences and grapheme sequences) is described, as is retraining by maximum likelihood training and discriminative training in adapting graphoneme model parameters using acoustic data. Also described is the unsupervised collection of grapheme labels for received acoustic data, thereby automatically obtaining a substantial number of actual samples that may be used in retraining. Speech input that does not meet a confidence threshold may be filtered out so as not to be used by the retrained model.
    • 10. Granted invention patent
    • Grapheme-to-phoneme conversion using acoustic data
    • US07991615B2
    • 2011-08-02
    • US11952267
    • 2007-12-07
    • Xiao Li, Asela J. R. Gunawardana, Alejandro Acero
    • Xiao Li, Asela J. R. Gunawardana, Alejandro Acero
    • G10L15/04
    • G10L13/08, G10L15/063, G10L15/187
    • Described is the use of acoustic data to improve grapheme-to-phoneme conversion for speech recognition, such as to more accurately recognize spoken names in a voice-dialing system. A joint model of acoustics and graphonemes (acoustic data, phoneme sequences, grapheme sequences and an alignment between phoneme sequences and grapheme sequences) is described, as is retraining by maximum likelihood training and discriminative training in adapting graphoneme model parameters using acoustic data. Also described is the unsupervised collection of grapheme labels for received acoustic data, thereby automatically obtaining a substantial number of actual samples that may be used in retraining. Speech input that does not meet a confidence threshold may be filtered out so as not to be used by the retrained model.