    • 2. Granted patent
    • Method for utilizing validity constraints in a speech endpoint detector
    • US06718302B1
    • 2004-04-06
    • US09482396
    • 2000-01-12
    • Duanpei Wu; Miyuki Tanaka; Ruxin Chen; Lex Olorenshaw
    • G10L11/02
    • G10L25/87
    • A method for utilizing validity constraints in a speech endpoint detector comprises a validity manager that may utilize a pulse width module to validate utterances that include a plurality of energy pulses during a certain time period. The validity manager also may utilize a minimum power module to ensure that speech energy below a pre-determined level is not classified as a valid utterance. In addition the validity manager may use a duration module to ensure that valid utterances fall within a specified duration. Finally, the validity manager may utilize a short-utterance minimum power module to specifically distinguish an utterance of short duration from background noise based on the energy level of the short utterance.
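The four checks named in the abstract above can be pictured as a chain of predicates applied to a candidate utterance. What follows is a minimal sketch under stated assumptions, not the patented implementation: the `Candidate` and `Limits` types, the way pulse widths are summed, and every numeric default are hypothetical, since the abstract gives no concrete values and does not say how the modules are combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A candidate utterance. All fields are a hypothetical representation."""
    pulse_widths_ms: List[float]  # width of each energy pulse in the segment
    peak_power_db: float          # peak speech energy of the segment
    duration_ms: float            # total length of the segment


@dataclass
class Limits:
    """Illustrative thresholds only; none of these values come from the source."""
    min_total_pulse_ms: float = 60.0
    min_power_db: float = 30.0
    min_duration_ms: float = 100.0
    max_duration_ms: float = 5000.0
    short_utterance_ms: float = 300.0
    short_min_power_db: float = 40.0


def is_valid_utterance(c: Candidate, lim: Limits = Limits()) -> bool:
    # Pulse width check (assumed form): several short pulses within the
    # segment may jointly count as speech if their total width is enough.
    if sum(c.pulse_widths_ms) < lim.min_total_pulse_ms:
        return False
    # Minimum power check: energy below the floor is never a valid utterance.
    if c.peak_power_db < lim.min_power_db:
        return False
    # Duration check: the utterance must fall inside the allowed range.
    if not (lim.min_duration_ms <= c.duration_ms <= lim.max_duration_ms):
        return False
    # Short-utterance check: short segments need a higher power floor so
    # that they are not confused with background noise.
    if (c.duration_ms < lim.short_utterance_ms
            and c.peak_power_db < lim.short_min_power_db):
        return False
    return True
```

With these illustrative limits, a 250 ms segment at 35 dB is rejected by the short-utterance rule, while an 800 ms segment at the same power passes.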
    • 4. Granted patent
    • System and method for speech verification using a confidence measure
    • US06473735B1
    • 2002-10-29
    • US09553985
    • 2000-04-20
    • Duanpei Wu; Xavier Menendez-Pidal; Lex Olorenshaw; Ruxin Chen
    • G10L15/06
    • G10L15/10; G10L2015/085
    • The present invention comprises a system and method for speech verification using a confidence measure that includes a speech verifier which compares a differential score for a recognized word to a predetermined threshold value, where a recognized word is the word model that produced the highest recognition score. In one embodiment, a single threshold is used for each word in a vocabulary. In another embodiment, each word model has an associated threshold, so that a differential score for a recognized word is compared to a unique threshold associated with that word. In a further embodiment, pairs of confused words in the vocabulary are dealt with separately. If a confused word is the recognized word, the speech verifier compares the differential score to a threshold that depends on the word model that produced the next-highest recognition score. Different values for the various thresholds may maximize rejection accuracy or recognition accuracy. A trade-off between rejection accuracy and recognition accuracy may be made by utilizing an intermediate threshold value that is between a minimum threshold value and a maximum threshold value.
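The three threshold schemes in the abstract above (one global threshold, one threshold per word, and a separate threshold for confused word pairs) can be sketched as a single lookup with increasing specificity. This is an illustration only. The abstract does not define the differential score, so the sketch assumes it is the gap between the highest and next-highest recognition scores; the function name `verify` and its parameters are hypothetical.

```python
from typing import Dict, Optional, Tuple


def verify(
    scores: Dict[str, float],
    global_threshold: float,
    word_thresholds: Optional[Dict[str, float]] = None,
    pair_thresholds: Optional[Dict[Tuple[str, str], float]] = None,
) -> Tuple[str, bool]:
    """Return (recognized_word, accepted). Needs scores for at least two words."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best_word, best_score), (next_word, next_score) = ranked[0], ranked[1]

    # Assumed definition: best score minus runner-up score.
    differential = best_score - next_score

    # Start from the single vocabulary-wide threshold.
    threshold = global_threshold
    # A per-word threshold, if present, overrides the global one.
    if word_thresholds and best_word in word_thresholds:
        threshold = word_thresholds[best_word]
    # For a confused pair, the threshold depends on the runner-up word.
    if pair_thresholds and (best_word, next_word) in pair_thresholds:
        threshold = pair_thresholds[(best_word, next_word)]

    return best_word, differential >= threshold
```

Raising a threshold rejects more borderline recognitions and lowering it accepts more, which is the trade-off between rejection accuracy and recognition accuracy that the abstract mentions.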
    • 6. Granted patent
    • Method and apparatus for a parameter sharing speech recognition system
    • US6006186A
    • 1999-12-21
    • US953026
    • 1997-10-16
    • Ruxin Chen; Miyuki Tanaka; Duanpei Wu; Lex S. Olorenshaw
    • G10L15/14; G10L15/18; G10L7/08
    • G10L15/142; G10L15/148
    • A method and an apparatus for a parameter sharing speech recognition system are provided. Speech signals are received into a processor of a speech recognition system. The speech signals are processed using a speech recognition system hosting a shared hidden Markov model (HMM) produced by generating a number of phoneme models, some of which are shared. The phoneme models are generated by retaining as a separate phoneme model any triphone model having a number of trained frames available that exceeds a prespecified threshold. A shared phoneme model is generated to represent each of the groups of triphone phoneme models for which the number of trained frames having a common biphone exceed the prespecified threshold. A shared phoneme model is generated to represent each of the groups of triphone phoneme models for which the number of trained frames having an equivalent effect on a phonemic context exceed the prespecified threshold. A shared phoneme model is generated to represent each of the groups of triphone phoneme models having the same center context. The generated phoneme models are trained, and shared phoneme model states are generated that are shared among the phoneme models. Shared probability distribution functions are generated that are shared among the phoneme model states. Shared probability sub-distribution functions are generated that are shared among the phoneme model probability distribution functions. The shared phoneme model hierarchy is reevaluated for further sharing in response to the shared probability sub-distribution functions. Signals representative of the received speech signals are generated.
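The model-selection step in the abstract above reads like a back-off scheme: keep a triphone model when it has enough trained frames, otherwise fall back to a shared biphone model, and finally to a model shared by center phone. The sketch below illustrates that reading under several assumptions: the "common biphone" is taken to be the (left, center) pair, the "equivalent effect on a phonemic context" grouping is omitted, and the model-name strings are invented for display. It does not cover the later sharing of states and distribution functions.

```python
from collections import defaultdict
from typing import Dict, Tuple

Triphone = Tuple[str, str, str]  # (left context, center phone, right context)


def assign_models(frames: Dict[Triphone, int], threshold: int) -> Dict[Triphone, str]:
    """Map each triphone to the name of the model chosen to represent it."""
    # Pool the frames of under-trained triphones by their (left, center) biphone.
    biphone_frames: Dict[Tuple[str, str], int] = defaultdict(int)
    for (left, center, _right), n in frames.items():
        if n <= threshold:
            biphone_frames[(left, center)] += n

    models: Dict[Triphone, str] = {}
    for (left, center, right), n in frames.items():
        if n > threshold:
            # Enough data: retain a separate triphone model.
            models[(left, center, right)] = f"{left}-{center}+{right}"
        elif biphone_frames[(left, center)] > threshold:
            # The pooled biphone group has enough data: share one model.
            models[(left, center, right)] = f"{left}-{center}+*"
        else:
            # Last resort: share by center phone only.
            models[(left, center, right)] = f"*-{center}+*"
    return models
```

For example, with a threshold of 100 frames, two triphones with 60 and 70 frames that share the left context `b` and center `ae` would both map to the shared model `b-ae+*`.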
    • 8. Granted patent
    • Source separation by independent component analysis in conjunction with source direction information
    • US08880395B2
    • 2014-11-04
    • US13464828
    • 2012-05-04
    • Jaekwon Yoo; Ruxin Chen
    • G10L21/00
    • G10L21/0272
    • Methods and apparatus for signal processing are disclosed. Source separation can be performed to extract source signals from mixtures of source signals by way of independent component analysis. Source direction information is utilized in the separation process, and independent component analysis techniques described herein use multivariate probability density functions to preserve the alignment of frequency bins in the source separation process. It is emphasized that this abstract is provided to comply with the rules requiring an abstract that will allow a searcher or other reader to quickly ascertain the subject matter of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.
    • 9. Patent application
    • Hybrid performance scaling or speech recognition
    • US20140237277A1
    • 2014-08-21
    • US13791716
    • 2013-03-08
    • Dominic S. Mallinson; Ruxin Chen
    • G06F1/32
    • G06F1/3206; G06F1/3203; G06F1/3231; G06F1/3293; G06F3/017; G10L15/22; G10L25/78; Y02D10/122; Y02D10/173
    • Aspects of the present disclosure describe methods and apparatuses for executing operations on a client device platform that is operating in a low-power state. A first analysis may be used to assign a first confidence score to a recorded non-tactile input. When the first confidence score is above a first threshold an intermediate-power state may be activated. A second more detailed analysis may then assign a second confidence score to the non-tactile input. When the second confidence score is above a second threshold, then the operation is initiated. It is emphasized that this abstract is provided to comply with the rules requiring an abstract that will allow a searcher or other reader to quickly ascertain the subject matter of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.
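The two-stage gating in the abstract above can be sketched as a cheap first analysis that decides whether to leave the low-power state, followed by a more detailed analysis that decides whether to initiate the operation. The names `PowerState`, `handle_input`, `coarse`, and `detailed` are hypothetical, and the two analyzers are passed in as plain callables because the abstract does not say what they compute.

```python
from enum import Enum
from typing import Callable, Tuple


class PowerState(Enum):
    LOW = "low"
    INTERMEDIATE = "intermediate"


def handle_input(
    recording: bytes,
    coarse: Callable[[bytes], float],
    detailed: Callable[[bytes], float],
    first_threshold: float,
    second_threshold: float,
) -> Tuple[PowerState, bool]:
    """Return (resulting power state, whether the operation is initiated)."""
    # First analysis: runs while the platform is in the low-power state.
    if coarse(recording) <= first_threshold:
        return PowerState.LOW, False

    # First score is above its threshold: activate the intermediate state.
    state = PowerState.INTERMEDIATE

    # Second, more detailed analysis runs only after waking up.
    initiated = detailed(recording) > second_threshold
    return state, initiated
```

Because the detailed analyzer is called only when the first score clears its threshold, most inputs are dismissed without leaving the low-power state.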
    • 10. Granted patent
    • Speech syllable/vowel/phone boundary detection using auditory attention cues
    • US08756061B2
    • 2014-06-17
    • US13078866
    • 2011-04-01
    • Ozlem Kalinli; Ruxin Chen
    • G10L15/04
    • G10L15/05; G10L15/04; G10L15/16; G10L15/24; G10L15/34; G10L25/03
    • In syllable or vowel or phone boundary detection during speech, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more syllable or vowel or phone boundaries in the input window of sound can be detected by mapping the cumulative gist vector to one or more syllable or vowel or phone boundary characteristics using a machine learning algorithm.
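The gist-vector step in the abstract above can be illustrated with a small sketch. Two assumptions are made that the abstract does not state: each auditory gist vector is taken to be the mean of each cell in a fixed grid laid over a feature map, and "augmentation" is read as concatenating the per-map gist vectors. The filter bank that produces the feature maps and the machine learning classifier are out of scope here, and the function names are hypothetical.

```python
from typing import List, Sequence

FeatureMap = Sequence[Sequence[float]]  # rows x columns of feature values


def gist_vector(fmap: FeatureMap, rows: int, cols: int) -> List[float]:
    """Mean of each cell of a rows x cols grid (assumed gist definition).

    Assumes the map has at least `rows` rows and `cols` columns.
    """
    h, w = len(fmap), len(fmap[0])
    gist: List[float] = []
    for r in range(rows):
        for c in range(cols):
            # Integer cell boundaries that together cover the whole map.
            r0, r1 = r * h // rows, (r + 1) * h // rows
            c0, c1 = c * w // cols, (c + 1) * w // cols
            cell = [fmap[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            gist.append(sum(cell) / len(cell))
    return gist


def cumulative_gist(maps: Sequence[FeatureMap], rows: int, cols: int) -> List[float]:
    """Concatenate the gist vectors of all feature maps into one vector."""
    out: List[float] = []
    for fmap in maps:
        out.extend(gist_vector(fmap, rows, cols))
    return out
```

The resulting cumulative vector has `rows * cols` entries per feature map and would be the input that a trained classifier maps to boundary decisions.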