    • 1. Granted invention patent
    • Multi-stage speaker adaptation
    • US08996366B2
    • 2015-03-31
    • US14181908
    • 2014-02-17
    • Google Inc.
    • Petar Aleksic; Xin Lei
    • G10L15/07; G10L17/00; G10L15/065
    • G10L17/00; G10L15/065; G10L15/07
    • A first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors that correspond to a first unit of input speech. The first set of feature vectors may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors, which correspond to a second unit of input speech, may be modified based on the first gender-specific speaker adaptation technique. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors, which correspond to a third unit of speech, may be modified based on the first speaker-dependent speaker adaptation technique.
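The staged flow in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the shift-based transforms, the mean-based selection rules, and every name here (`make_shift`, `select_gender_transform`, `select_speaker_transform`, `adapt_three_units`) are hypothetical and are not taken from the patent.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Transform = Callable[[Vector], Vector]


def make_shift(offset: float) -> Transform:
    """Build a toy adaptation that adds `offset` to every feature (assumption)."""
    def shift(vector: Vector) -> Vector:
        return [x + offset for x in vector]
    return shift


def mean_of(features: Sequence[Vector]) -> float:
    """Average of all feature values in a unit of speech."""
    values = [x for vector in features for x in vector]
    return sum(values) / len(values)


# Hypothetical pool of gender-specific adaptations, keyed by an arbitrary label.
GENDER_TRANSFORMS: Dict[str, Transform] = {
    "a": make_shift(1.0),
    "b": make_shift(-1.0),
}


def select_gender_transform(features: Sequence[Vector]) -> Transform:
    """Stage 1: pick a gender-specific adaptation from the first unit's features."""
    return GENDER_TRANSFORMS["a" if mean_of(features) < 0.0 else "b"]


def select_speaker_transform(features: Sequence[Vector]) -> Transform:
    """Stage 2: pick a speaker-dependent adaptation (here: mean centring)."""
    return make_shift(-mean_of(features))


def adapt_three_units(first, second, third):
    """Apply each stage's adaptation to the *next* unit of input speech."""
    gender_transform = select_gender_transform(first)
    second_modified = [gender_transform(v) for v in second]
    speaker_transform = select_speaker_transform(second)
    third_modified = [speaker_transform(v) for v in third]
    return second_modified, third_modified
```

The point of the sketch is the ordering: each adaptation is chosen from one unit of speech and applied to the following unit, moving from a coarse (gender-specific) to a finer (speaker-dependent) technique.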
    • 3. Granted invention patent
    • Online incremental adaptation of deep neural networks using auxiliary Gaussian mixture models in speech recognition
    • US09466292B1
    • 2016-10-11
    • US13886620
    • 2013-05-03
    • Google Inc.
    • Xin Lei; Petar Aleksic
    • G10L15/00; G10L15/16
    • G10L15/16; G10L15/07; G10L15/14
    • Methods and systems for online incremental adaptation of neural networks using Gaussian mixture models in speech recognition are described. In an example, a computing device may be configured to receive an audio signal and a subsequent audio signal, both signals having speech content. The computing device may be configured to apply a speaker-specific feature transform to the audio signal to obtain a transformed audio signal. The speaker-specific feature transform may be configured to include speaker-specific speech characteristics of a speaker-profile relating to the speech content. Further, the computing device may be configured to process the transformed audio signal using a neural network trained to estimate a respective speech content of the audio signal. Based on outputs of the neural network, the computing device may be configured to modify the speaker-specific feature transform, and apply the modified speaker-specific feature transform to a subsequent audio signal.
    • 4. Invention patent application
    • Multi-Stage Speaker Adaptation
    • US20140025378A1
    • 2014-01-23
    • US14035499
    • 2013-09-24
    • Google Inc.
    • Petar Aleksic; Xin Lei
    • G10L17/00
    • G10L17/00; G10L15/065; G10L15/07
    • A first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors that correspond to a first unit of input speech. The first set of feature vectors may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors, which correspond to a second unit of input speech, may be modified based on the first gender-specific speaker adaptation technique. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors, which correspond to a third unit of speech, may be modified based on the first speaker-dependent speaker adaptation technique.
    • 5. Granted invention patent
    • Speech recognition parameter adjustment
    • US08600746B1
    • 2013-12-03
    • US13649747
    • 2012-10-11
    • Google Inc.
    • Xin Lei; Patrick An Nguyen
    • G10L15/26; G10L15/00
    • G10L15/22; G10L15/30; G10L2015/226
    • Audio data that encodes an utterance of a user is received. It is determined that the user has been classified as a novice user of a speech recognizer. A speech recognizer setting is selected that is used by the speech recognizer in generating a transcription of the utterance. The selected speech recognizer setting is different than a default speech recognizer setting that is used by the speech recognizer in generating transcriptions of utterances of users that are not classified as novice users. The selected speech recognizer setting results in increased speech recognition accuracy in comparison with the default setting. A transcription of the utterance is obtained that is generated by the speech recognizer using the selected setting.
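A minimal sketch of the setting selection described in the abstract above. The abstract does not say how a user is classified as a novice or which setting is changed; the utterance-count rule, the `beam_width` field, and all names below are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognizerSetting:
    # Hypothetical tunable; the abstract does not name a specific setting.
    beam_width: int


DEFAULT_SETTING = RecognizerSetting(beam_width=8)
# Assumed to trade speed for accuracy for novice users.
NOVICE_SETTING = RecognizerSetting(beam_width=16)

# Assumed classification rule: few prior utterances means "novice".
NOVICE_UTTERANCE_LIMIT = 10


def is_novice(prior_utterances: int) -> bool:
    """Classify a user as a novice from their usage history (assumption)."""
    return prior_utterances < NOVICE_UTTERANCE_LIMIT


def select_setting(prior_utterances: int) -> RecognizerSetting:
    """Return the non-default setting for novices, the default otherwise."""
    return NOVICE_SETTING if is_novice(prior_utterances) else DEFAULT_SETTING
```

A recognizer would then transcribe the utterance with whatever `select_setting` returns, so experienced users keep the default behavior.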
    • 7. Granted invention patent
    • Localized speech recognition with offload
    • US08880398B1
    • 2014-11-04
    • US13746039
    • 2013-01-21
    • Google Inc.
    • Petar Aleksic; Xin Lei
    • G10L15/00; G10L21/00
    • G10L21/00; G10L15/07; G10L15/30; G10L2015/223
    • A local computing device may receive an utterance from a user device. In response to receiving the utterance, the local computing device may obtain a text string transcription of the utterance, and determine a response mode for the utterance. If the response mode is a text-based mode, the local computing device may provide the text string transcription to a target device. If the response mode is a non-text-based mode, the local computing device may convert the text string transcription into one or more commands from a command set supported by the target device, and provide the one or more commands to the target device.
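The response-mode branch in the abstract above might look roughly like this. The command set, the phrase matching, and the names `ResponseMode`, `COMMAND_SET`, and `build_payload` are hypothetical; the abstract does not specify how a transcription is mapped to commands.

```python
from enum import Enum
from typing import Dict, List


class ResponseMode(Enum):
    TEXT = "text"        # forward the transcription as-is
    COMMAND = "command"  # non-text-based: translate into device commands


# Hypothetical command set supported by the target device.
COMMAND_SET: Dict[str, str] = {
    "volume up": "VOL_UP",
    "volume down": "VOL_DOWN",
}


def build_payload(transcription: str, mode: ResponseMode) -> List[str]:
    """Return what the local computing device would send to the target device."""
    if mode is ResponseMode.TEXT:
        return [transcription]
    # Non-text mode: keep only commands whose trigger phrase was spoken.
    text = transcription.lower()
    return [command for phrase, command in COMMAND_SET.items() if phrase in text]
```

For instance, `build_payload("Please turn the Volume Up", ResponseMode.COMMAND)` yields a list containing only the assumed `VOL_UP` command, while text mode passes the string through untouched.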
    • 9. Granted invention patent
    • Localized speech recognition with offload
    • US08554559B1
    • 2013-10-08
    • US13746115
    • 2013-01-21
    • Google Inc.
    • Petar Aleksic; Xin Lei
    • G10L15/00
    • G10L21/00; G10L15/07; G10L15/30; G10L2015/223
    • A local computing device may receive an utterance from a user device. In response to receiving the utterance, the local computing device may obtain a text string transcription of the utterance, and determine a response mode for the utterance. If the response mode is a text-based mode, the local computing device may provide the text string transcription to a target device. If the response mode is a non-text-based mode, the local computing device may convert the text string transcription into one or more commands from a command set supported by the target device, and provide the one or more commands to the target device.
    • 10. Granted invention patent
    • Realtime acoustic adaptation using stability measures
    • US08515750B1
    • 2013-08-20
    • US13622576
    • 2012-09-19
    • Google Inc.
    • Xin Lei; Petar Aleksic
    • G10L15/26
    • G10L17/14; G10L15/07; G10L15/26
    • Methods, systems, and computer programs encoded on a computer storage medium for real-time acoustic adaptation using stability measures are disclosed. The methods include the actions of receiving a transcription of a first portion of a speech session, wherein the transcription of the first portion of the speech session is generated using a speaker adaptation profile. The actions further include receiving a stability measure for a segment of the transcription and determining that the stability measure for the segment satisfies a threshold. Additionally, the actions include triggering an update of the speaker adaptation profile using the segment, or using a portion of speech data that corresponds to the segment. And the actions include receiving a transcription of a second portion of the speech session, wherein the transcription of the second portion of the speech session is generated using the updated speaker adaptation profile.
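The threshold-gated profile update in the abstract above can be sketched as follows. The 0.9 threshold, the "greater than or equal" reading of "satisfies a threshold", and the `Segment`, `SpeakerProfile`, and `update_profile_from_stable_segments` names are all assumptions for illustration, not details from the patent.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Segment:
    text: str
    stability: float  # assumed to lie in [0, 1]; higher means less likely to change


@dataclass
class SpeakerProfile:
    # Stand-in for real adaptation statistics: just remember the trusted text.
    adaptation_texts: List[str] = field(default_factory=list)

    def update(self, segment: Segment) -> None:
        self.adaptation_texts.append(segment.text)


STABILITY_THRESHOLD = 0.9  # hypothetical value


def update_profile_from_stable_segments(
    profile: SpeakerProfile,
    segments: Iterable[Segment],
    threshold: float = STABILITY_THRESHOLD,
) -> int:
    """Update the profile with each segment whose stability meets the threshold."""
    updated = 0
    for segment in segments:
        if segment.stability >= threshold:
            profile.update(segment)
            updated += 1
    return updated
```

The second portion of the speech session would then be transcribed with the updated profile, so only segments unlikely to be revised feed the adaptation.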