    • 1. Granted invention patent
    • Self-learning speaker adaptation based on spectral variation source decomposition
    • US5664059A
    • 1997-09-02
    • US710361
    • 1996-09-16
    • Yunxin Zhao
    • Yunxin Zhao
    • G10L5/06; G10L7/08
    • G10L15/07; G10L15/144
    • A self-learning speaker adaptation method for automatic speech recognition is provided. The method includes building a plurality of Gaussian mixture density phone models for use in recognizing speech. The Gaussian mixture density phone models are used to recognize a first utterance of speech from a given speaker. After the first utterance of speech has been recognized, the recognized first utterance of speech is used to adapt the Gaussian mixture density phone models for use in recognizing a subsequent utterance of speech from that same speaker, whereby the Gaussian mixture density phone models are automatically adapted to that speaker in self-learning fashion to thereby produce a plurality of adapted Gaussian mixture density phone models.
    • 3. Granted invention patent
    • Training module for estimating mixture Gaussian densities for speech unit models in speech recognition systems
    • US5450523A
    • 1995-09-12
    • US71334
    • 1993-06-01
    • Yunxin Zhao
    • Yunxin Zhao
    • G10L9/00
    • A model-training module generates mixture Gaussian density models from speech training data for continuous or isolated-word speech recognition systems. Speech feature sequences are labeled into segments of states of speech units using a Viterbi-decoding-based optimized segmentation algorithm. Each segment is modeled by a Gaussian density, and the parameters are estimated by sample mean and sample covariance. A mixture Gaussian density is generated for each state of each speech unit by merging the Gaussian densities of all the segments with the same corresponding label. The resulting number of mixture components is proportional to the dispersion and sample size of the training data. A single, fully merged Gaussian density is also generated for each state of each speech unit. The covariance matrices of the mixture components are selectively smoothed by a measure of relative sharpness of the Gaussian density, and the smoothing can also be done blockwise. The weights of the mixture components are initially set uniformly and are reestimated using a segmental-average procedure. The weighting coefficients, together with the Gaussian densities, then become the models of speech units for use in speech recognition.
    • 4. Granted invention patent
    • Word hypothesizer for continuous speech decoding using stressed-vowel centered bidirectional tree searches
    • US5349645A
    • 1994-09-20
    • US807255
    • 1991-12-31
    • Yunxin Zhao
    • Yunxin Zhao
    • G10L15/14; G10L9/00
    • G10L15/142
    • A word hypothesis module for speech decoding consists of four submodules: vowel center detection, bidirectional tree searches around each vowel center, forward-backward pruning, and additional short-word hypotheses. By detecting the strong-energy vowel centers, a vowel-centered lexicon tree can be placed at each vowel center and searches can be performed in both the left and right directions, where only simple phone models are used for fast acoustic match. A stage-wise forward-backward technique computes the word-beginning and word-ending likelihood scores over the generated half-word lattice for further pruning of the lattice. To avoid missing short words with weak-energy vowel centers, a lexicon tree is compiled for these words and tree searches are performed between each pair of adjacent vowel centers. The integration of the word hypothesizer with a top-down Viterbi beam search in continuous speech decoding provides two-pass decoding, which significantly reduces computation time.
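The first submodule, vowel center detection, could look something like the sketch below. The abstract only says that strong-energy vowel centers are detected, so the peak-picking rule, the `threshold` and `min_gap` parameters, and the function names are all assumptions made for illustration. The second helper lists the spans between adjacent centers, where the abstract says the extra short-word searches are run.

```python
import numpy as np

def detect_vowel_centers(energy, threshold, min_gap=1):
    """Return frame indices of local energy peaks at or above threshold.

    Peaks closer than min_gap frames are collapsed to the stronger one.
    This is a simple stand-in rule, not the detector from the patent.
    """
    energy = np.asarray(energy, dtype=float)
    centers = []
    for t in range(1, len(energy) - 1):
        is_peak = energy[t] > energy[t - 1] and energy[t] >= energy[t + 1]
        if not (is_peak and energy[t] >= threshold):
            continue
        if centers and t - centers[-1] < min_gap:
            # Too close to the previous center: keep whichever is stronger
            if energy[t] > energy[centers[-1]]:
                centers[-1] = t
        else:
            centers.append(t)
    return centers

def adjacent_spans(centers):
    """Pairs of neighboring centers, i.e. the regions for short-word search."""
    return list(zip(centers, centers[1:]))

frame_energy = [0, 1, 5, 1, 0, 2, 0, 6, 7, 6, 0]   # made-up values
centers = detect_vowel_centers(frame_energy, threshold=3, min_gap=3)
print(centers, adjacent_spans(centers))
```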
    • 9. Granted invention patent
    • Training module for estimating mixture Gaussian densities for speech-unit models in speech recognition systems
    • US5193142A
    • 1993-03-09
    • US613352
    • 1990-11-15
    • Yunxin Zhao
    • Yunxin Zhao
    • G10L15/14
    • G10L15/144
    • A model-training module generates mixture Gaussian density models from speech training data for continuous or isolated-word HMM-based speech recognition systems. Speech feature sequences are labeled into segments of states of speech units using a Viterbi-decoding-based optimized segmentation algorithm. Each segment is modeled by a Gaussian density, and the parameters are estimated by sample mean and sample covariance. A mixture Gaussian density is generated for each state of each speech unit by merging the Gaussian densities of all the segments with the same corresponding label. The resulting number of mixture components is proportional to the dispersion and sample size of the training data. A single, fully merged Gaussian density is also generated for each state of each speech unit. The covariance matrices of the mixture components are selectively smoothed by a measure of relative sharpness of the Gaussian density. The weights of the mixture components are initially set uniformly and are reestimated using a segmental-average procedure. The weighting coefficients, together with the Gaussian densities, then become the models of speech units for use in speech recognition.
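One common way to collapse several Gaussians into the "single, fully merged" density mentioned in this abstract is moment matching, sketched below. The abstract does not state how the merge is computed, so treat this as one plausible reading; the function name `merge_gaussians` and the choice of weights (for example, uniform or proportional to segment length) are assumptions.

```python
import numpy as np

def merge_gaussians(weights, means, covs):
    """Moment-matched merge of K Gaussians into a single Gaussian.

    The merged mean is the weighted mean of the component means; the
    merged covariance is the weighted within-component covariance plus
    the weighted spread of the component means around the merged mean.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                  # normalize the weights
    means = np.asarray(means, dtype=float)           # shape (K, D)
    covs = np.asarray(covs, dtype=float)             # shape (K, D, D)
    mean = np.einsum("k,kd->d", w, means)
    diffs = means - mean
    within = np.einsum("k,kij->ij", w, covs)
    between = np.einsum("k,ki,kj->ij", w, diffs, diffs)
    return mean, within + between

# Two unit-covariance components whose means differ along the first axis
mean, cov = merge_gaussians([1, 1], [[0, 0], [2, 0]], [np.eye(2), np.eye(2)])
print(mean)
print(cov)
```

The merged covariance grows along the axis where the component means disagree, which is the behavior one would want from a single density summarizing the whole state.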