    • 3. Invention application
    • System and Method for Making a User Dependent Language Model
    • US20100145677A1
    • 2010-06-10
    • US12396933
    • 2009-03-03
    • Chang-Qing Shu
    • Chang-Qing Shu
    • G06F17/27, G06F17/30
    • G06F16/313
    • A language model for a speech recognition engine is made based on user-viewed data files. The data files are reviewed and texts are extracted therefrom. The language model is generated based on the extracted texts. Transcriptions of previous user statements are not required. Different weighting factors can be applied to elements of the extracted texts based on the nature of the data files. The weighting factors are then considered during generation of the language model. A user dependent and application independent language model can be created prior to initial use of the speech recognition engine.
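The abstract above describes weighting extracted text according to the nature of its source file before building the language model. A minimal sketch of that idea follows, assuming a simple unigram model and whitespace tokenization; the function name, the sample texts, and the weight values are hypothetical illustrations and are not taken from the patent.

```python
from collections import Counter


def build_weighted_unigram_model(documents):
    """Build unigram probabilities from (text, weight) pairs.

    Each token's count is scaled by the weight of the document it came from,
    so text from more heavily weighted files contributes more probability mass.
    """
    counts = Counter()
    for text, weight in documents:
        for token in text.lower().split():
            counts[token] += weight
    total = sum(counts.values())
    if total == 0:
        return {}
    return {token: count / total for token, count in counts.items()}


# Hypothetical user-viewed files: the weights are illustrative assumptions.
docs = [
    ("meeting agenda budget", 2.0),  # e.g. a file type given a higher weight
    ("budget report", 1.0),          # e.g. a file type given a lower weight
]
model = build_weighted_unigram_model(docs)
```

No transcriptions of earlier user speech are needed here: the model is derived only from the extracted texts and their weights.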
    • 4. Granted invention patent
    • Using word confidence score, insertion and substitution thresholds for selected words in speech recognition
    • US09478218B2
    • 2016-10-25
    • US12258093
    • 2008-10-24
    • Chang-Qing Shu
    • Chang-Qing Shu
    • G10L15/00, G10L15/187
    • G10L15/01, G10L15/187, G10L25/51
    • A method and system for improving the accuracy of a speech recognition system using word confidence score (WCS) processing is introduced. Parameters in a decoder are selected to minimize a weighted total error rate, such that deletion errors are weighted more heavily than substitution and insertion errors. The occurrence distribution in WCS is different depending on whether the word was correctly identified and based on the type of error. This is used to determine thresholds in WCS for insertion and substitution errors. By processing the hypothetical word (HYP) (output of the decoder), a mHYP (modified HYP) is determined. In some circumstances, depending on the WCS's value in relation to insertion and substitution threshold values, mHYP is set equal to: null, a substituted HYP, or HYP.
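The abstract above names two mechanisms: a weighted total error rate in which deletions count more heavily, and a rule that maps each decoder output (HYP) to a modified output (mHYP) based on its word confidence score. The sketch below illustrates both under stated assumptions. The abstract does not give the exact comparison rule or weight values, so the ordering of the thresholds, the default weights, and all function and parameter names are hypothetical.

```python
def weighted_total_error_rate(deletions, substitutions, insertions,
                              reference_words,
                              deletion_weight=2.0, other_weight=1.0):
    """Error rate in which deletion errors are weighted more heavily.

    The weight values are illustrative assumptions, not taken from the patent.
    """
    weighted_errors = (deletion_weight * deletions
                       + other_weight * (substitutions + insertions))
    return weighted_errors / reference_words


def modify_hypothesis(hyp, wcs, insertion_threshold, substitution_threshold,
                      substitute):
    """Return mHYP for one decoder output word.

    Assumed rule: a very low score suggests an insertion error (drop the word),
    a moderately low score suggests a substitution error (use a substitute),
    and anything else keeps the original hypothesis.
    """
    if wcs < insertion_threshold:
        return None  # mHYP is null
    if wcs < substitution_threshold:
        return substitute  # mHYP is a substituted HYP
    return hyp  # mHYP equals HYP


rate = weighted_total_error_rate(deletions=1, substitutions=2, insertions=3,
                                 reference_words=10)
```

In this sketch the thresholds are supplied by the caller; in the described method they would be derived from the observed WCS distributions for correct words and for each error type.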
    • 5. Invention application
    • USING WORD CONFIDENCE SCORE, INSERTION AND SUBSTITUTION THRESHOLDS FOR SELECTED WORDS IN SPEECH RECOGNITION
    • US20100106505A1
    • 2010-04-29
    • US12258093
    • 2008-10-24
    • Chang-Qing Shu
    • Chang-Qing Shu
    • G10L15/00
    • G10L15/01, G10L15/187, G10L25/51
    • A method and system for improving the accuracy of a speech recognition system using word confidence score (WCS) processing is introduced. Parameters in a decoder are selected to minimize a weighted total error rate, such that deletion errors are weighted more heavily than substitution and insertion errors. The occurrence distribution in WCS is different depending on whether the word was correctly identified and based on the type of error. This is used to determine thresholds in WCS for insertion and substitution errors. By processing the hypothetical word (HYP) (output of the decoder), a mHYP (modified HYP) is determined. In some circumstances, depending on the WCS's value in relation to insertion and substitution threshold values, mHYP is set equal to: null, a substituted HYP, or HYP.
    • 6. Granted invention patent
    • System and method for training an acoustic model with reduced feature space variation
    • 用于训练具有减小的特征空间变化的声学模型的系统和方法
    • US08301446B2
    • 2012-10-30
    • US12413896
    • 2009-03-30
    • Chang-Qing Shu
    • Chang-Qing Shu
    • G10L15/00
    • G10L15/187, G10L15/063, G10L2015/025
    • Feature space variation associated with specific text elements is reduced by training an acoustic model with a phoneme set, dictionary and transcription set configured to better distinguish the specific text elements and at least some specific phonemes associated therewith. The specific text elements can include the most frequently occurring text elements from a text data set, which can include text data beyond the transcriptions of a training data set. The specific text elements can be identified using a text element distribution table sorted by occurrence within the text data set. Specific phonemes can be limited to consonant phonemes to improve speed and accuracy.
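The abstract above identifies the specific text elements through a distribution table sorted by occurrence within the text data set. A minimal sketch of such a table follows, assuming that text elements are whitespace-separated words; the function names and sample texts are hypothetical.

```python
from collections import Counter


def text_element_distribution(texts):
    """Return (element, count) pairs sorted from most to least frequent."""
    counts = Counter(
        token for text in texts for token in text.lower().split()
    )
    return counts.most_common()


def most_frequent_elements(texts, top_n):
    """Pick the top_n most frequent elements as the 'specific text elements'."""
    return [token for token, _ in text_element_distribution(texts)[:top_n]]


# Hypothetical text data set for illustration only.
sample_texts = ["the cat sat", "the dog sat", "the end"]
table = text_element_distribution(sample_texts)
specific = most_frequent_elements(sample_texts, top_n=2)
```

The selected elements would then drive how the phoneme set, dictionary and transcription set are configured, which this sketch does not attempt to model.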
    • 7. Granted invention patent
    • Systems and methods for implementing segmentation in speech recognition systems
    • US07680662B2
    • 2010-03-16
    • US11129254
    • 2005-05-13
    • Chang-Qing Shu, Han Shu
    • Chang-Qing Shu, Han Shu
    • G10L15/00
    • G10L15/142, G10L15/04, G10L2015/025
    • A speech recognition system (105) includes an acoustic front end (115) and a processing unit (125). The acoustic front end (115) receives frames of acoustic data and determines cepstral coefficients for each of the received frames. The processing unit (125) determines a number of peaks in the cepstral coefficients for each of the received frames of acoustic data and compares the peaks in the cepstral coefficients of a first one of the received frames with the peaks in the cepstral coefficients of at least a second one of the received frames. The processing unit (125) then segments the received frames of acoustic data based on the comparison.
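The abstract above segments frames by comparing the peaks in their cepstral coefficients. The sketch below illustrates one possible reading, assuming the cepstral coefficients are already computed, that a peak is a strict local maximum, and that a segment boundary is placed wherever the peak count changes between consecutive frames. The comparison rule, the `min_change` parameter, and the function names are hypothetical and are not specified in the abstract.

```python
def count_peaks(coefficients):
    """Count strict local maxima in one frame's cepstral coefficients."""
    return sum(
        1
        for i in range(1, len(coefficients) - 1)
        if coefficients[i - 1] < coefficients[i] > coefficients[i + 1]
    )


def segment_boundaries(frames, min_change=1):
    """Return frame indices at which a new segment is assumed to start.

    A boundary is placed at frame i when its peak count differs from that of
    frame i - 1 by at least min_change (an illustrative assumption).
    """
    peak_counts = [count_peaks(frame) for frame in frames]
    return [
        i
        for i in range(1, len(peak_counts))
        if abs(peak_counts[i] - peak_counts[i - 1]) >= min_change
    ]


# Toy frames with made-up coefficient values.
frames = [
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
boundaries = segment_boundaries(frames)
```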