    • 1. Invention application
    • LOW-FOOTPRINT ADAPTATION AND PERSONALIZATION FOR A DEEP NEURAL NETWORK
    • WO2015134294A1
    • 2015-09-11
    • PCT/US2015/017872
    • 2015-02-27
    • MICROSOFT TECHNOLOGY LICENSING, LLC
    • XUE, Jian; LI, Jinyu; YU, Dong; SELTZER, Michael L.; GONG, Yifan
    • G10L15/07; G10L15/16
    • G10L15/16; G06N3/082; G10L15/075
    • The adaptation and personalization of a deep neural network (DNN) model for automatic speech recognition (ASR) is provided. An utterance that includes speech features for one or more speakers may be received in ASR tasks such as voice search or short message dictation. A decomposition approach may then be applied to an original matrix in the DNN model. In response, the original matrix may be converted into multiple new matrices, each smaller than the original matrix. A square matrix may then be added to the new matrices, and speaker-specific parameters may be stored in it. The DNN model may then be adapted by updating the square matrix. This process may be applied to every original matrix in the DNN model. The adapted DNN model may include fewer parameters than the original DNN model.
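The decomposition step in the abstract above can be illustrated with a small sketch. The abstract only says "a decomposition approach"; the use of a truncated singular value decomposition, the function name `decompose_with_square`, and the rank `k` are assumptions made here for illustration, not details taken from the filing.

```python
import numpy as np

def decompose_with_square(W, k):
    """Factor W (m x n) into A (m x k), S (k x k), B (k x n).

    Assumption: a truncated SVD stands in for the unspecified
    "decomposition approach". S starts as the identity, so the
    factored layer initially computes the same rank-k map as A @ B.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :k] * s[:k]   # scale each kept column by its singular value
    B = Vt[:k, :]
    S = np.eye(k)          # square matrix that would hold speaker-specific parameters
    return A, S, B

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 48))          # hypothetical original weight matrix
A, S, B = decompose_with_square(W, k=8)

# Parameter counts: the shared factors versus the per-speaker square matrix.
print("original:", W.size)
print("factored (shared):", A.size + B.size)
print("per speaker:", S.size)
```

In this sketch, adapting to a speaker would mean updating only `S` while `A` and `B` stay fixed, so the per-speaker storage is `k * k` values rather than `m * n`.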
    • 2. Invention application
    • SUPERVISED ADAPTATION USING CORRECTIVE N-BEST DECODING
    • WO00051105A1
    • 2000-08-31
    • PCT/US2000/001838
    • 2000-01-25
    • G10L15/06; G10L15/14
    • G10L15/075; G10L2015/0635
    • Supervised adaptation speech is supplied to the recognizer (10), and the recognizer generates the N-best transcriptions of the adaptation speech (14). These transcriptions include the one transcription known to be correct, based on a priori knowledge of the adaptation speech, and the remaining transcriptions, known to be incorrect. The system applies a weight to each transcription (16): a positive weight to the correct transcription and negative weights to the incorrect transcriptions. These weights have the effect of moving the incorrect transcriptions away from the correct one, rendering the recognition system more discriminative for the new speaker's speaking characteristics. Weights applied to the incorrect solutions are based on the respective likelihood scores generated by the recognizer. Preferably, the sum of all weights (positive and negative) is a positive number. This ensures that the system will converge.
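The weighting scheme in the abstract above can be sketched as follows. The abstract does not give a formula, so everything specific here is an assumption: the function name `corrective_weights`, the choice of normalizing the incorrect hypotheses' likelihoods with a softmax, and the scale factor `rho`, which is kept below 1 so that the weights sum to a positive number.

```python
import math

def corrective_weights(correct_idx, log_likelihoods, rho=0.5):
    """Assign +1 to the correct hypothesis and negative weights to the rest.

    Assumptions (not from the filing): negative weights are proportional
    to the softmax of the incorrect hypotheses' log-likelihoods and their
    magnitudes sum to rho, with 0 <= rho < 1, so the total is 1 - rho > 0.
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must be in [0, 1) so the weights sum to a positive number")
    wrong = [i for i in range(len(log_likelihoods)) if i != correct_idx]
    weights = [0.0] * len(log_likelihoods)
    weights[correct_idx] = 1.0
    if wrong:
        # Subtract the maximum before exponentiating for numerical stability.
        top = max(log_likelihoods[i] for i in wrong)
        exps = {i: math.exp(log_likelihoods[i] - top) for i in wrong}
        total = sum(exps.values())
        for i in wrong:
            weights[i] = -rho * exps[i] / total
    return weights

# Hypothetical 3-best list; index 0 is the transcription known to be correct.
weights = corrective_weights(0, [-10.0, -11.0, -14.0])
print(weights, sum(weights))
```

Under these assumptions, an incorrect hypothesis that the recognizer scored as more likely receives a larger negative weight, which matches the abstract's statement that the negative weights depend on the likelihood scores.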
    • 3. Invention application
    • SYSTEM AND METHODS FOR ADAPTING NEURAL NETWORK ACOUSTIC MODELS
    • WO2017099936A1
    • 2017-06-15
    • PCT/US2016/061326
    • 2016-11-10
    • NUANCE COMMUNICATIONS, INC.
    • ZHAN, Puming; LI, Xinwei
    • G10L15/16; G10L15/07
    • G10L15/075; G10L15/07; G10L15/14; G10L15/16; G10L17/02
    • Techniques for adapting a trained neural network acoustic model, comprising using at least one computer hardware processor to perform: generating initial speaker information values for a speaker; generating first speech content values from first speech data corresponding to a first utterance spoken by the speaker; processing the first speech content values and the initial speaker information values using the trained neural network acoustic model; recognizing, using automatic speech recognition, the first utterance based, at least in part, on results of the processing; generating updated speaker information values using the first speech data and at least one of the initial speaker information values and/or information used to generate the initial speaker information values; and recognizing, based at least in part on the updated speaker information values, a second utterance spoken by the speaker.
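The sequence of steps in the abstract above (initial speaker values, first utterance, updated speaker values, second utterance) can be sketched in a few lines. The abstract does not say how the speaker information values are computed, so this sketch assumes a running mean of frame features as a stand-in; the class `SpeakerStats`, the function `model_inputs`, and the feature dimensions are all hypothetical.

```python
import numpy as np

class SpeakerStats:
    """Accumulated statistics from which speaker information values are derived.

    Assumption: a running mean of frame features stands in for whatever
    speaker representation the filing actually uses.
    """
    def __init__(self, dim):
        self.total = np.zeros(dim)
        self.count = 0

    def values(self):
        # Initial speaker information values are zeros until speech is seen.
        if self.count == 0:
            return np.zeros_like(self.total)
        return self.total / self.count

    def update(self, frames):
        # Fold the new utterance into the statistics used for earlier values.
        self.total += frames.sum(axis=0)
        self.count += frames.shape[0]

def model_inputs(frames, speaker_values):
    """Append the speaker values to every frame of speech content values."""
    tiled = np.tile(speaker_values, (frames.shape[0], 1))
    return np.hstack([frames, tiled])

rng = np.random.default_rng(0)
stats = SpeakerStats(dim=3)
utt1 = rng.standard_normal((5, 3))   # hypothetical frames of the first utterance
utt2 = rng.standard_normal((4, 3))   # hypothetical frames of the second utterance

x1 = model_inputs(utt1, stats.values())   # first utterance uses the initial values
stats.update(utt1)                        # update using the first speech data
x2 = model_inputs(utt2, stats.values())   # second utterance uses the updated values
print(x1.shape, x2.shape)
```

The arrays `x1` and `x2` are what a trained acoustic model would consume in this sketch; the model itself is omitted because the abstract gives no details about its architecture.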