    • 1. Granted invention patent
    • Frame-level combination of deep neural network and gaussian mixture models
    • US09240184B1
    • 2016-01-19
    • US13765002
    • 2013-02-12
    • Hui Lin, Xin Lei, Vincent Vanhoucke
    • Hui Lin, Xin Lei, Vincent Vanhoucke
    • G10L15/22
    • G10L15/22, G10L15/142
    • A method and system for frame-level merging of HMM state predictions determined by different techniques is disclosed. An audio input signal may be transformed into first and second sequences of feature vectors, the sequences corresponding to each other and to a temporal sequence of frames of the audio input signal on a frame-by-frame basis. The first sequence may be processed by a neural network (NN) to determine NN-based state predictions, and the second sequence may be processed by a Gaussian mixture model (GMM) to determine GMM-based state predictions. The NN-based and GMM-based state predictions may be merged as weighted sums for each of a plurality of HMM states on a frame-by-frame basis to determine merged state predictions. The merged state predictions may then be applied to the HMMs to determine speech content of the audio input signal.
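The frame-level merge described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `merge_state_predictions`, the list-of-lists score layout, and the use of one fixed weight pair per HMM state are all assumptions made for illustration, and the HMM decoding step that would consume the merged scores is omitted.

```python
def merge_state_predictions(nn_scores, gmm_scores, nn_weights, gmm_weights):
    """Merge NN-based and GMM-based scores as weighted sums, frame by frame.

    nn_scores / gmm_scores: one list per frame, one score per HMM state.
    nn_weights / gmm_weights: one weight per HMM state (hypothetical layout).
    """
    if len(nn_scores) != len(gmm_scores):
        raise ValueError("both score sequences must cover the same frames")
    merged = []
    for nn_frame, gmm_frame in zip(nn_scores, gmm_scores):
        if not (len(nn_frame) == len(gmm_frame) == len(nn_weights) == len(gmm_weights)):
            raise ValueError("every frame needs one score and one weight per state")
        # Weighted sum for each HMM state within this frame.
        merged.append([
            w_nn * s_nn + w_gmm * s_gmm
            for s_nn, s_gmm, w_nn, w_gmm in zip(nn_frame, gmm_frame, nn_weights, gmm_weights)
        ])
    return merged


def best_state_per_frame(merged):
    """Index of the highest merged score in each frame (illustration only)."""
    return [max(range(len(frame)), key=frame.__getitem__) for frame in merged]


# Two frames, three HMM states, equal weighting of both models (made-up numbers).
nn = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]]
gmm = [[0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]
merged = merge_state_predictions(nn, gmm, [0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
```

In a full recognizer the merged scores would be passed to the HMM decoder rather than reduced with a per-frame maximum; `best_state_per_frame` is only there to make the effect of the merge visible.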
    • 2. Granted invention patent
    • Mobile speech recognition with explicit tone features
    • US08725498B1
    • 2014-05-13
    • US13556856
    • 2012-07-24
    • Yun-hsuan Sung, Meihong Wang, Xin Lei
    • Yun-hsuan Sung, Meihong Wang, Xin Lei
    • G10L15/02
    • G10L15/02, G10L25/48, G10L25/90
    • A computer-implemented method for digital speech processing, including (1) receiving, at a server computer, digital speech data from a computing device, the digital speech data comprising data points sampled at respective time points; (2) computing, by the server computer, a tonal feature of the digital speech data, the tonal feature comprising information encoding fundamental frequencies at the respective time points; (3) computing, by the server computer, a logarithm of the tonal feature at the respective time points; and (4) processing, by the server computer, the logarithm of the tonal feature based on a characterization of the digital speech data at the respective time points.
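Steps (2) to (4) of this abstract can be sketched roughly as follows. The abstract does not say what the "characterization" of the speech data is or how the logarithm is processed, so this sketch assumes a per-frame voiced/unvoiced flag and fills unvoiced frames by linear interpolation of log-F0. The function name `log_tonal_feature` and both input lists are hypothetical, and fundamental-frequency estimation itself is taken as given.

```python
import math


def log_tonal_feature(f0_hz, voiced):
    """Return log-F0 per frame, interpolating across unvoiced frames.

    f0_hz:  fundamental frequency estimate per frame, in Hz.
    voiced: per-frame flag (assumed characterization of the speech data).
    """
    if len(f0_hz) != len(voiced):
        raise ValueError("f0_hz and voiced must have the same length")
    # Step (3): take the logarithm where a usable F0 value exists.
    log_f0 = [math.log(f) if v and f > 0 else None for f, v in zip(f0_hz, voiced)]
    known = [i for i, value in enumerate(log_f0) if value is not None]
    if not known:
        return [0.0] * len(log_f0)  # nothing voiced: fall back to zeros
    # Step (4), as assumed here: fill unvoiced frames from voiced neighbours.
    result = []
    for i, value in enumerate(log_f0):
        if value is not None:
            result.append(value)
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:
            result.append(log_f0[nxt])
        elif nxt is None:
            result.append(log_f0[prev])
        else:
            t = (i - prev) / (nxt - prev)
            result.append(log_f0[prev] + t * (log_f0[nxt] - log_f0[prev]))
    return result


# Three frames with an unvoiced gap in the middle (made-up values).
contour = log_tonal_feature([100.0, 0.0, 200.0], [True, False, True])
```

Working in the log domain means equal pitch ratios map to equal differences, which is one common reason to take the logarithm of F0 before further processing; whether the patent relies on that rationale is not stated in the abstract.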