Quick Patent Search: fast search of global patents, free commercial patent database - IPRDB

1. Patent application

US20120014537A1 System and Method for Automatic Microphone Volume Setting (In force)
Publication number: US20120014537A1
Publication date: 2012-01-19
Application number: US12835440
Filing date: 2010-07-13
Applicant: Chang-Qing Shu, Dezhi Liao
Inventor: Chang-Qing Shu, Dezhi Liao
IPC classification: H03G3/00
CPC classification: H03G3/24, H03G7/002, H03G7/007
Abstract: Optimal microphone volumes are automatically set for computer applications based on determination of peak volume levels and noise levels from one or more digital audio captures. The peak volume levels and noise levels can be advantageously determined based on distribution curves of sample volume levels in the digital audio captures. Clipping can be automatically compensated for by estimating peak unclipped capture volume levels from the distribution curves.
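To make the idea of reading peak and noise levels off a distribution of sample volume levels concrete, here is a minimal Python sketch. It is not the patented method: the percentile choices, the 16-bit full-scale value, the target fraction, and the function names `level_percentiles` and `suggest_gain` are all assumptions made for illustration, and clipping compensation is not attempted.

```python
def level_percentiles(samples, noise_pct=10.0, peak_pct=99.0):
    """Estimate (noise, peak) from the sorted distribution of |sample| values.

    The percentiles are illustrative assumptions, not values from the patent.
    """
    levels = sorted(abs(s) for s in samples)
    if not levels:
        raise ValueError("empty capture")

    def pick(pct):
        # Index into the sorted levels, clamped to the last element.
        idx = min(len(levels) - 1, int(len(levels) * pct / 100.0))
        return levels[idx]

    return pick(noise_pct), pick(peak_pct)


def suggest_gain(samples, full_scale=32767, target_fraction=0.8):
    """Return a multiplicative gain that moves the estimated peak to a
    chosen fraction of full scale (hypothetical policy)."""
    _, peak = level_percentiles(samples)
    if peak == 0:
        return 1.0  # silent capture: leave the volume unchanged
    return target_fraction * full_scale / peak
```

A caller would pass the integer samples of one capture to `suggest_gain` and scale the current microphone volume by the result; a real system would also need to bound the gain and account for the noise estimate.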

2. Granted patent

US08559656B2 System and method for automatic microphone volume setting (In force)
Publication number: US08559656B2
Publication date: 2013-10-15
Application number: US12835440
Filing date: 2010-07-13
Applicant: Chang-Qing Shu, Dezhi Liao
Inventor: Chang-Qing Shu, Dezhi Liao
IPC classification: H03G3/00, H04M9/08, H04L27/08
CPC classification: H03G3/24, H03G7/002, H03G7/007
Abstract: Optimal microphone volumes are automatically set for computer applications based on determination of peak volume levels and noise levels from one or more digital audio captures. The peak volume levels and noise levels can be advantageously determined based on distribution curves of sample volume levels in the digital audio captures. Clipping can be automatically compensated for by estimating peak unclipped capture volume levels from the distribution curves.

3. Patent application

US20100145677A1 System and Method for Making a User Dependent Language Model (Under examination, published)
Publication number: US20100145677A1
Publication date: 2010-06-10
Application number: US12396933
Filing date: 2009-03-03
Applicant: Chang-Qing Shu
Inventor: Chang-Qing Shu
IPC classification: G06F17/27, G06F17/30
CPC classification: G06F16/313
Abstract: A language model for a speech recognition engine is made based on user-viewed data files. The data files are reviewed and texts are extracted therefrom. The language model is generated based on the extracted texts. Transcriptions of previous user statements are not required. Different weighting factors can be applied to elements of the extracted texts based on the nature of the data files. The weighting factors are then considered during generation of the language model. A user dependent and application independent language model can be created prior to initial use of the speech recognition engine.
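As a rough illustration of applying weighting factors to extracted texts before building a language model, the sketch below builds a weighted unigram model in Python. The unigram form, whitespace tokenization, and the name `weighted_unigram_model` are assumptions for the example; the abstract does not specify the model type or how weights are chosen.

```python
from collections import Counter


def weighted_unigram_model(documents):
    """Build word probabilities from (text, weight) pairs.

    Each word occurrence contributes its document's weight rather than 1,
    so texts from more relevant files count for more (assumed scheme).
    """
    counts = Counter()
    for text, weight in documents:
        for word in text.lower().split():
            counts[word] += weight
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}
```

For instance, text extracted from a frequently viewed file could be passed with weight 2.0 and text from an archived file with weight 1.0; the resulting probabilities then favor the vocabulary of the former.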

4. Granted patent

US09478218B2 Using word confidence score, insertion and substitution thresholds for selected words in speech recognition (In force)
Publication number: US09478218B2
Publication date: 2016-10-25
Application number: US12258093
Filing date: 2008-10-24
Applicant: Chang-Qing Shu
Inventor: Chang-Qing Shu
IPC classification: G10L15/00, G10L15/187
CPC classification: G10L15/01, G10L15/187, G10L25/51
Abstract: A method and system for improving the accuracy of a speech recognition system using word confidence score (WCS) processing is introduced. Parameters in a decoder are selected to minimize a weighted total error rate, such that deletion errors are weighted more heavily than substitution and insertion errors. The occurrence distribution in WCS is different depending on whether the word was correctly identified and based on the type of error. This is used to determine thresholds in WCS for insertion and substitution errors. By processing the hypothetical word (HYP) (output of the decoder), a mHYP (modified HYP) is determined. In some circumstances, depending on the WCS's value in relation to insertion and substitution threshold values, mHYP is set equal to: null, a substituted HYP, or HYP.
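The two ideas in this abstract, a weighted total error rate and threshold-based post-processing of HYP, can be sketched in Python as follows. The abstract does not state which WCS range maps to which outcome, so the ordering used here (below the insertion threshold gives null, below the substitution threshold gives a substitute, otherwise HYP is kept) is an assumption, as are the function names and the default deletion weight.

```python
def weighted_error_rate(deletions, substitutions, insertions,
                        reference_words, deletion_weight=2.0):
    """Total error rate with deletions weighted more heavily.

    The weight value is a placeholder, not taken from the patent.
    """
    if reference_words <= 0:
        raise ValueError("reference_words must be positive")
    weighted = deletion_weight * deletions + substitutions + insertions
    return weighted / reference_words


def modify_hypothesis(hyp, wcs, insertion_threshold,
                      substitution_threshold, substitute):
    """Return mHYP for one decoded word (assumed decision rule).

    Assumes insertion_threshold <= substitution_threshold.
    """
    if wcs < insertion_threshold:
        return None        # treat the word as a likely insertion error
    if wcs < substitution_threshold:
        return substitute  # treat the word as a likely substitution error
    return hyp             # confident enough: keep the decoder output
```

In this reading, the thresholds would be tuned per selected word from the observed WCS distributions of correct words, insertions, and substitutions.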

5. Patent application

US20100106505A1 USING WORD CONFIDENCE SCORE, INSERTION AND SUBSTITUTION THRESHOLDS FOR SELECTED WORDS IN SPEECH RECOGNITION (In force)
Publication number: US20100106505A1
Publication date: 2010-04-29
Application number: US12258093
Filing date: 2008-10-24
Applicant: Chang-Qing Shu
Inventor: Chang-Qing Shu
IPC classification: G10L15/00
CPC classification: G10L15/01, G10L15/187, G10L25/51
Abstract: A method and system for improving the accuracy of a speech recognition system using word confidence score (WCS) processing is introduced. Parameters in a decoder are selected to minimize a weighted total error rate, such that deletion errors are weighted more heavily than substitution and insertion errors. The occurrence distribution in WCS is different depending on whether the word was correctly identified and based on the type of error. This is used to determine thresholds in WCS for insertion and substitution errors. By processing the hypothetical word (HYP) (output of the decoder), a mHYP (modified HYP) is determined. In some circumstances, depending on the WCS's value in relation to insertion and substitution threshold values, mHYP is set equal to: null, a substituted HYP, or HYP.

6. Granted patent

US08301446B2 System and method for training an acoustic model with reduced feature space variation (In force)
Publication number: US08301446B2
Publication date: 2012-10-30
Application number: US12413896
Filing date: 2009-03-30
Applicant: Chang-Qing Shu
Inventor: Chang-Qing Shu
IPC classification: G10L15/00
CPC classification: G10L15/187, G10L15/063, G10L2015/025
Abstract: Feature space variation associated with specific text elements is reduced by training an acoustic model with a phoneme set, dictionary and transcription set configured to better distinguish the specific text elements and at least some specific phonemes associated therewith. The specific text elements can include the most frequently occurring text elements from a text data set, which can include text data beyond the transcriptions of a training data set. The specific text elements can be identified using a text element distribution table sorted by occurrence within the text data set. Specific phonemes can be limited to consonant phonemes to improve speed and accuracy.
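A text element distribution table sorted by occurrence is straightforward to sketch. The Python below treats whitespace-separated words as the text elements, which is an assumption (the abstract does not define the element granularity), and the names `text_element_table` and `most_frequent` are hypothetical.

```python
from collections import Counter


def text_element_table(texts):
    """Return (element, count) rows sorted by descending count.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    counts = Counter(word for text in texts for word in text.lower().split())
    return sorted(counts.items(), key=lambda row: (-row[1], row[0]))


def most_frequent(texts, n):
    """Pick the n most frequently occurring elements as the 'specific' ones."""
    return [element for element, _ in text_element_table(texts)[:n]]
```

The elements returned by `most_frequent` would be the candidates for dedicated handling in the phoneme set, dictionary, and transcriptions.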

7. Granted patent

US07680662B2 Systems and methods for implementing segmentation in speech recognition systems (Lapsed)
Publication number: US07680662B2
Publication date: 2010-03-16
Application number: US11129254
Filing date: 2005-05-13
Applicant: Chang-Qing Shu, Han Shu
Inventor: Chang-Qing Shu, Han Shu
IPC classification: G10L15/00
CPC classification: G10L15/142, G10L15/04, G10L2015/025
Abstract: A speech recognition system (105) includes an acoustic front end (115) and a processing unit (125). The acoustic front end (115) receives frames of acoustic data and determines cepstral coefficients for each of the received frames. The processing unit (125) determines a number of peaks in the cepstral coefficients for each of the received frames of acoustic data and compares the peaks in the cepstral coefficients of a first one of the received frames with the peaks in the cepstral coefficients of at least a second one of the received frames. The processing unit (125) then segments the received frames of acoustic data based on the comparison.
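The peak-counting and comparison steps can be illustrated with a small Python sketch that starts from already-computed cepstral coefficient vectors. The comparison rule used here (a boundary wherever adjacent frames have different peak counts) is an assumption chosen for simplicity; the abstract says only that segmentation is based on a comparison of peaks. The function names are hypothetical.

```python
def count_peaks(coeffs):
    """Count strict local maxima in one frame's cepstral coefficients."""
    return sum(
        1
        for i in range(1, len(coeffs) - 1)
        if coeffs[i - 1] < coeffs[i] > coeffs[i + 1]
    )


def segment_boundaries(frames):
    """Return indices of frames that start a new segment.

    A boundary is placed where the peak count differs from the
    previous frame's count (assumed comparison rule).
    """
    counts = [count_peaks(frame) for frame in frames]
    return [i for i in range(1, len(counts)) if counts[i] != counts[i - 1]]
```

Computing the cepstral coefficients themselves (the acoustic front end's job) is left out; each element of `frames` is assumed to be a list of coefficients for one frame.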

8. Granted patent

US09659559B2 Phonetic distance measurement system and related methods (In force)
Publication number: US09659559B2
Publication date: 2017-05-23
Application number: US12491769
Filing date: 2009-06-25
Applicant: Chang-Qing Shu
Inventor: Chang-Qing Shu
IPC classification: G10L15/01, G10L15/22, G10L15/14, G10L15/187, G10L15/06, G06F17/28
CPC classification: G10L15/01, G06F17/2881, G10L15/063, G10L15/142, G10L15/144, G10L15/187, G10L15/22, G10L2015/025, G10L2015/221
Abstract: Phonetic distances are empirically measured as a function of speech recognition engine recognition error rates. The error rates are determined by comparing a recognized speech file with a reference file. The phonetic distances can be normalized to earlier measurements. The phonetic distances/error rates can also be used to improve speech recognition engine grammar selection, as an aid in language training and evaluation, and in other applications.
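The error-rate input to such a measurement can be sketched as a word-level edit distance between the reference text and the recognized text. This Python example is an assumption about how the comparison might be done, not the patent's procedure; the mapping from error rate to phonetic distance is not given in the abstract, so only the error rate and a simple ratio-based normalization (also assumed) are shown.

```python
def error_rate(reference, recognized):
    """Word error rate: edit distance between word lists / reference length."""
    ref, hyp = reference.split(), recognized.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


def normalized(rate, baseline_rate):
    """Express a rate relative to an earlier measurement (hypothetical)."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return rate / baseline_rate
```

Running `error_rate` over recordings of two words or phrases that an engine tends to confuse would give the raw figures from which a distance could then be derived.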

9. Patent application

US20120046946A1 SYSTEM AND METHOD FOR MERGING AUDIO DATA STREAMS FOR USE IN SPEECH RECOGNITION APPLICATIONS (In force)
Publication number: US20120046946A1
Publication date: 2012-02-23
Application number: US12860245
Filing date: 2010-08-20
Applicant: Chang-Qing Shu
Inventor: Chang-Qing Shu
IPC classification: G10L15/06
CPC classification: G10L15/02, G10L2021/02165
Abstract: A system and method for merging audio data streams receive audio data streams from separate inputs, independently transform each data stream from the time to the frequency domain, and generate separate feature data sets for the transformed data streams. Feature data from each of the separate feature data sets is selected to form a merged feature data set that is output to a decoder for recognition purposes. The separate inputs can include an ear microphone and a mouth microphone.
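The selection step, taking some features from each stream's feature set to form one merged set, might look like the Python sketch below. Which features come from which microphone is not stated in the abstract, so the caller-supplied `ear_indices` parameter and the function name `merge_features` are assumptions; the time-to-frequency transform and feature extraction are assumed to have happened already.

```python
def merge_features(ear_frames, mouth_frames, ear_indices):
    """Merge per-frame feature vectors from two streams.

    For each frame, positions listed in ear_indices are taken from the
    ear-microphone features and all other positions from the
    mouth-microphone features (assumed selection rule).
    """
    if len(ear_frames) != len(mouth_frames):
        raise ValueError("streams must have the same number of frames")
    from_ear = set(ear_indices)
    merged = []
    for ear, mouth in zip(ear_frames, mouth_frames):
        if len(ear) != len(mouth):
            raise ValueError("feature vectors must have the same length")
        merged.append([
            ear[i] if i in from_ear else mouth[i]
            for i in range(len(ear))
        ])
    return merged
```

The merged frames would then be handed to the decoder in place of either single-stream feature set.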

10. Patent application

US20100332230A1 PHONETIC DISTANCE MEASUREMENT SYSTEM AND RELATED METHODS (In force)
Publication number: US20100332230A1
Publication date: 2010-12-30
Application number: US12491769
Filing date: 2009-06-25
Applicant: Chang-Qing Shu
Inventor: Chang-Qing Shu
IPC classification: G10L15/04, G09B19/06
CPC classification: G10L15/01, G06F17/2881, G10L15/063, G10L15/142, G10L15/144, G10L15/187, G10L15/22, G10L2015/025, G10L2015/221
Abstract: Phonetic distances are empirically measured as a function of speech recognition engine recognition error rates. The error rates are determined by comparing a recognized speech file with a reference file. The phonetic distances can be normalized to earlier measurements. The phonetic distances/error rates can also be used to improve speech recognition engine grammar selection, as an aid in language training and evaluation, and in other applications.
