Quick Patent Search - Fast search of global patents, free patent database for commercial use - IPRDB

1. Granted invention patent

US09478218B2 Using word confidence score, insertion and substitution thresholds for selected words in speech recognition (In force)
Publication No.: US09478218B2
Publication date: 2016-10-25
Application No.: US12258093
Filing date: 2008-10-24
Applicant: Chang-Qing Shu
Inventor: Chang-Qing Shu
IPC classification: G10L15/00, G10L15/187
CPC classification: G10L15/01, G10L15/187, G10L25/51
Abstract: A method and system for improving the accuracy of a speech recognition system using word confidence score (WCS) processing is introduced. Parameters in a decoder are selected to minimize a weighted total error rate, such that deletion errors are weighted more heavily than substitution and insertion errors. The occurrence distribution in WCS is different depending on whether the word was correctly identified and based on the type of error. This is used to determine thresholds in WCS for insertion and substitution errors. By processing the hypothetical word (HYP) (output of the decoder), a mHYP (modified HYP) is determined. In some circumstances, depending on the WCS's value in relation to insertion and substitution threshold values, mHYP is set equal to: null, a substituted HYP, or HYP.
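The thresholding described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not say which WCS range maps to which outcome, so the ordering below (lowest scores dropped as likely insertions, middle scores replaced, high scores kept) is an assumption, as are the function names, the `substitute` argument, and the default error weights.

```python
from typing import Optional


def modify_hypothesis(hyp: str, wcs: float, ins_threshold: float,
                      sub_threshold: float, substitute: str) -> Optional[str]:
    """Return mHYP for one decoder output word (assumed decision order)."""
    if wcs < ins_threshold:
        # Assumed: a very low score suggests an insertion error, so drop the word.
        return None
    if wcs < sub_threshold:
        # Assumed: a middling score suggests a substitution error, so swap the word.
        return substitute
    # Otherwise keep the decoder output unchanged.
    return hyp


def weighted_error_rate(deletions: int, substitutions: int, insertions: int,
                        n_ref: int, w_del: float = 2.0, w_sub: float = 1.0,
                        w_ins: float = 1.0) -> float:
    """Weighted total error rate with deletions weighted more heavily.

    The weight values are hypothetical; the abstract only states that
    deletion errors carry more weight than the other two error types.
    """
    total = w_del * deletions + w_sub * substitutions + w_ins * insertions
    return total / n_ref
```

For example, `modify_hypothesis("cat", 0.1, 0.2, 0.5, "cap")` drops the word, while a score of 0.9 with the same thresholds keeps "cat".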

2. Invention patent application

US20120014537A1 System and Method for Automatic Microphone Volume Setting (In force)
Publication No.: US20120014537A1
Publication date: 2012-01-19
Application No.: US12835440
Filing date: 2010-07-13
Applicant: Chang-Qing Shu, Dezhi Liao
Inventor: Chang-Qing Shu, Dezhi Liao
IPC classification: H03G3/00
CPC classification: H03G3/24, H03G7/002, H03G7/007
Abstract: Optimal microphone volumes are automatically set for computer applications based on determination of peak volume levels and noise levels from one or more digital audio captures. The peak volume levels and noise levels can be advantageously determined based on distribution curves of sample volume levels in the digital audio captures. Clipping can be automatically compensated for by estimating peak unclipped capture volume levels from the distribution curves.
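One simple way to read peak and noise levels off a distribution of sample volumes is to use percentiles of the absolute sample values. The sketch below assumes that approach; the percentile choices, the `target_peak` value, and all function names are hypothetical, and the clipping compensation mentioned in the abstract is not attempted here.

```python
from typing import List, Tuple


def estimate_levels(samples: List[float], noise_q: float = 0.10,
                    peak_q: float = 0.99) -> Tuple[float, float]:
    """Estimate (noise level, peak level) from sorted absolute sample values."""
    if not samples:
        raise ValueError("samples must not be empty")
    mags = sorted(abs(s) for s in samples)

    def at(q: float) -> float:
        # Index of the q-th quantile, clamped to the last element.
        return mags[min(len(mags) - 1, int(q * len(mags)))]

    return at(noise_q), at(peak_q)


def suggest_gain(samples: List[float], target_peak: float = 0.8) -> float:
    """Scale factor that would bring the estimated peak to target_peak."""
    _, peak = estimate_levels(samples)
    if peak == 0:
        return 1.0  # Silent capture: leave the volume unchanged.
    return target_peak / peak
```

With a capture that is half near-silence and half speech at amplitude 0.4, `suggest_gain` proposes doubling the volume to reach the assumed target of 0.8.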

3. Invention patent application

US20100106505A1 USING WORD CONFIDENCE SCORE, INSERTION AND SUBSTITUTION THRESHOLDS FOR SELECTED WORDS IN SPEECH RECOGNITION (In force)
Publication No.: US20100106505A1
Publication date: 2010-04-29
Application No.: US12258093
Filing date: 2008-10-24
Applicant: Chang-Qing Shu
Inventor: Chang-Qing Shu
IPC classification: G10L15/00
CPC classification: G10L15/01, G10L15/187, G10L25/51
Abstract: A method and system for improving the accuracy of a speech recognition system using word confidence score (WCS) processing is introduced. Parameters in a decoder are selected to minimize a weighted total error rate, such that deletion errors are weighted more heavily than substitution and insertion errors. The occurrence distribution in WCS is different depending on whether the word was correctly identified and based on the type of error. This is used to determine thresholds in WCS for insertion and substitution errors. By processing the hypothetical word (HYP) (output of the decoder), a mHYP (modified HYP) is determined. In some circumstances, depending on the WCS's value in relation to insertion and substitution threshold values, mHYP is set equal to: null, a substituted HYP, or HYP.

4. Granted invention patent

US08301446B2 System and method for training an acoustic model with reduced feature space variation (In force)
Publication No.: US08301446B2
Publication date: 2012-10-30
Application No.: US12413896
Filing date: 2009-03-30
Applicant: Chang-Qing Shu
Inventor: Chang-Qing Shu
IPC classification: G10L15/00
CPC classification: G10L15/187, G10L15/063, G10L2015/025
Abstract: Feature space variation associated with specific text elements is reduced by training an acoustic model with a phoneme set, dictionary and transcription set configured to better distinguish the specific text elements and at least some specific phonemes associated therewith. The specific text elements can include the most frequently occurring text elements from a text data set, which can include text data beyond the transcriptions of a training data set. The specific text elements can be identified using a text element distribution table sorted by occurrence within the text data set. Specific phonemes can be limited to consonant phonemes to improve speed and accuracy.
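The text element distribution table mentioned in this abstract can be illustrated with a word frequency count. The sketch below assumes that a "text element" is a whitespace-separated, lowercased word; the function names and the `top_n` parameter are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def distribution_table(texts: List[str]) -> List[Tuple[str, int]]:
    """Return (word, count) pairs sorted by descending occurrence."""
    counts = Counter(word for line in texts for word in line.lower().split())
    return counts.most_common()


def most_frequent_elements(texts: List[str], top_n: int) -> List[str]:
    """Pick the top_n most frequent words as the 'specific text elements'."""
    return [word for word, _ in distribution_table(texts)[:top_n]]
```

Given the lines "yes no yes" and "no yes maybe", the two most frequent elements are "yes" and "no".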

5. Granted invention patent

US07680662B2 Systems and methods for implementing segmentation in speech recognition systems (Lapsed)
Publication No.: US07680662B2
Publication date: 2010-03-16
Application No.: US11129254
Filing date: 2005-05-13
Applicant: Chang-Qing Shu, Han Shu
Inventor: Chang-Qing Shu, Han Shu
IPC classification: G10L15/00
CPC classification: G10L15/142, G10L15/04, G10L2015/025
Abstract: A speech recognition system (105) includes an acoustic front end (115) and a processing unit (125). The acoustic front end (115) receives frames of acoustic data and determines cepstral coefficients for each of the received frames. The processing unit (125) determines a number of peaks in the cepstral coefficients for each of the received frames of acoustic data and compares the peaks in the cepstral coefficients of a first one of the received frames with the peaks in the cepstral coefficients of at least a second one of the received frames. The processing unit (125) then segments the received frames of acoustic data based on the comparison.
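The peak counting and frame comparison described in this abstract can be sketched as below. The sketch takes already computed cepstral coefficients as input and assumes that the "comparison" is the difference in peak count between adjacent frames; that rule, the `min_change` parameter, and the function names are illustrative assumptions.

```python
from typing import List, Sequence


def count_peaks(coeffs: Sequence[float]) -> int:
    """Count strict local maxima among one frame's cepstral coefficients."""
    return sum(
        1
        for i in range(1, len(coeffs) - 1)
        if coeffs[i - 1] < coeffs[i] > coeffs[i + 1]
    )


def segment_boundaries(frames: Sequence[Sequence[float]],
                       min_change: int = 1) -> List[int]:
    """Return indices of frames that start a new segment.

    Assumed rule: a boundary is placed wherever the peak count differs
    from the previous frame's count by at least min_change.
    """
    counts = [count_peaks(frame) for frame in frames]
    return [
        i
        for i in range(1, len(counts))
        if abs(counts[i] - counts[i - 1]) >= min_change
    ]
```

For three frames with peak counts 2, 2, and 1, the only boundary falls at frame index 2.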

6. Granted invention patent

US08731923B2 System and method for merging audio data streams for use in speech recognition applications (In force)
Publication No.: US08731923B2
Publication date: 2014-05-20
Application No.: US12860245
Filing date: 2010-08-20
Applicant: Chang-Qing Shu
Inventor: Chang-Qing Shu
IPC classification: G10L15/00
CPC classification: G10L15/02, G10L2021/02165
Abstract: A system and method for merging audio data streams receive audio data streams from separate inputs, independently transform each data stream from the time to the frequency domain, and generate separate feature data sets for the transformed data streams. Feature data from each of the separate feature data sets is selected to form a merged feature data set that is output to a decoder for recognition purposes. The separate inputs can include an ear microphone and a mouth microphone.
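The selection step in this abstract, where features from two separate sets are combined into one merged set, can be illustrated with a per-feature selection mask. The sketch starts after feature extraction and does not model the time-to-frequency transform. How features are actually chosen is not stated in the abstract, so the mask `take_from_ear` and the function name are hypothetical.

```python
from typing import List, Sequence


def merge_features(ear: Sequence[float], mouth: Sequence[float],
                   take_from_ear: Sequence[bool]) -> List[float]:
    """Build one merged feature vector from two aligned feature vectors.

    For each feature index, the mask decides which input supplies the value.
    """
    if not (len(ear) == len(mouth) == len(take_from_ear)):
        raise ValueError("all inputs must have the same length")
    return [e if use_ear else m
            for e, m, use_ear in zip(ear, mouth, take_from_ear)]
```

With ear features `[1, 2, 3]`, mouth features `[10, 20, 30]`, and mask `[True, False, True]`, the merged set is `[1, 20, 3]`.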

7. Granted invention patent

US08515734B2 Integrated language model, related systems and methods (In force)
Publication No.: US08515734B2
Publication date: 2013-08-20
Application No.: US12701788
Filing date: 2010-02-08
Applicant: Chang-Qing Shu, Han Shu, John M. Mervin
Inventor: Chang-Qing Shu, Han Shu, John M. Mervin
IPC classification: G06F17/27, G10L15/00
CPC classification: G10L15/18, G06F17/2775, G10L15/183
Abstract: An integrated language model includes an upper-level language model component and a lower-level language model component, with the upper-level language model component including a non-terminal and the lower-level language model component being applied to the non-terminal. The upper-level and lower-level language model components can be of the same or different language model formats, including finite state grammar (FSG) and statistical language model (SLM) formats. Systems and methods for making integrated language models allow designation of language model formats for the upper-level and lower-level components and identification of non-terminals. Automatic non-terminal replacement and retention criteria can be used to facilitate the generation of one or both language model components, which can include the modification of existing language models.
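The idea of applying a lower-level component to a non-terminal in an upper-level component can be illustrated with a toy expansion. The sketch below treats both levels as finite phrase lists, which loosely resembles the FSG case only; the `$CITY` token, the example phrases, and the function name are hypothetical, and the statistical language model case is not covered.

```python
from typing import List


def expand(upper_sentences: List[str], non_terminal: str,
           lower_phrases: List[str]) -> List[str]:
    """Replace a non-terminal in upper-level sentences with lower-level phrases."""
    expanded = []
    for sentence in upper_sentences:
        tokens = sentence.split()
        if non_terminal not in tokens:
            # No non-terminal: the sentence is kept as it is.
            expanded.append(sentence)
            continue
        for phrase in lower_phrases:
            expanded.append(
                " ".join(phrase if t == non_terminal else t for t in tokens)
            )
    return expanded
```

Expanding `["fly to $CITY", "cancel"]` with the lower-level phrases `["boston", "new york"]` yields "fly to boston", "fly to new york", and "cancel".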

8. Invention patent application

US20110196668A1 Integrated Language Model, Related Systems and Methods (In force)
Publication No.: US20110196668A1
Publication date: 2011-08-11
Application No.: US12701788
Filing date: 2010-02-08
Applicant: Chang-Qing Shu, Han Shu, John M. Mervin
Inventor: Chang-Qing Shu, Han Shu, John M. Mervin
IPC classification: G06F17/27, G10L15/18
CPC classification: G10L15/18, G06F17/2775, G10L15/183
Abstract: An integrated language model includes an upper-level language model component and a lower-level language model component, with the upper-level language model component including a non-terminal and the lower-level language model component being applied to the non-terminal. The upper-level and lower-level language model components can be of the same or different language model formats, including finite state grammar (FSG) and statistical language model (SLM) formats. Systems and methods for making integrated language models allow designation of language model formats for the upper-level and lower-level components and identification of non-terminals. Automatic non-terminal replacement and retention criteria can be used to facilitate the generation of one or both language model components, which can include the modification of existing language models.

9. Invention patent application

US20100250240A1 SYSTEM AND METHOD FOR TRAINING AN ACOUSTIC MODEL WITH REDUCED FEATURE SPACE VARIATION (In force)
Publication No.: US20100250240A1
Publication date: 2010-09-30
Application No.: US12413896
Filing date: 2009-03-30
Applicant: Chang-Qing Shu
Inventor: Chang-Qing Shu
IPC classification: G06F17/21
CPC classification: G10L15/187, G10L15/063, G10L2015/025
Abstract: Feature space variation associated with specific text elements is reduced by training an acoustic model with a phoneme set, dictionary and transcription set configured to better distinguish the specific text elements and at least some specific phonemes associated therewith. The specific text elements can include the most frequently occurring text elements from a text data set, which can include text data beyond the transcriptions of a training data set. The specific text elements can be identified using a text element distribution table sorted by occurrence within the text data set. Specific phonemes can be limited to consonant phonemes to improve speed and accuracy.

10. Invention patent application

US20050209851A1 Systems and methods for implementing segmentation in speech recognition systems (Lapsed)
Publication No.: US20050209851A1
Publication date: 2005-09-22
Application No.: US11129254
Filing date: 2005-05-13
Applicant: Chang-Qing Shu, Han Shu
Inventor: Chang-Qing Shu, Han Shu
IPC classification: G10L15/00, G10L15/02, G10L15/04, G10L15/14
CPC classification: G10L15/142, G10L15/04, G10L2015/025
Abstract: A speech recognition system (105) includes an acoustic front end (115) and a processing unit (125). The acoustic front end (115) receives frames of acoustic data and determines cepstral coefficients for each of the received frames. The processing unit (125) determines a number of peaks in the cepstral coefficients for each of the received frames of acoustic data and compares the peaks in the cepstral coefficients of a first one of the received frames with the peaks in the cepstral coefficients of at least a second one of the received frames. The processing unit (125) then segments the received frames of acoustic data based on the comparison.
