Quick Patent Search - Search global patents quickly, free commercial patent database - IPRDB

91. Invention application

WO2018097936A1 TRAINED DATA INPUT SYSTEM (Pending - Published)
Publication No.: WO2018097936A1
Publication date: 2018-05-31
Application No.: PCT/US2017/059145
Filing date: 2017-10-31
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: WILLSON, Matthew James; ORR, Douglas Alexander Harper; ISO-SIPILÄ, Juha; FISCATO, Marco
IPC classification: G06F3/023, G06F3/0488, G06F17/27
CPC classification: G06F17/276, G06F3/0237, G06F3/04886, G06F17/2785, G10L25/30
Abstract: A data input system has a processor which receives user input comprising a sequence of one or more items and a language model which computes candidate next items in the sequence using the user input. A training engine trains the language model using data about a plurality of true words which a user intended to input using the data input system, and for each true word, at least one alternative candidate, being a word computed assuming imperfect entry of the true word to the data input system.

92. Invention application

WO2018085724A1 QUASI-RECURRENT NEURAL NETWORK BASED ENCODER-DECODER MODEL (Pending - Published)
Publication No.: WO2018085724A1
Publication date: 2018-05-11
Application No.: PCT/US2017/060051
Filing date: 2017-11-03
Applicant: SALESFORCE.COM, INC.
Inventors: BRADBURY, James; MERITY, Stephen, Joseph; XIONG, Caiming; SOCHER, Richard
IPC classification: G06N3/04, G10L15/16, G10L15/18, G06F17/20, G10L25/30
CPC classification: G06N3/08, G06F17/16, G06F17/20, G06F17/2715, G06F17/2785, G06F17/2818, G06N3/04, G06N3/0445, G06N3/0454, G06N3/10, G10L15/16, G10L15/18, G10L15/1815, G10L25/30
Abstract: The technology disclosed provides a quasi-recurrent neural network (QRNN) that alternates convolutional layers, which apply in parallel across timesteps, and minimalist recurrent pooling layers that apply in parallel across feature dimensions.

93. Invention application

WO2018085722A1 QUASI-RECURRENT NEURAL NETWORK (Pending - Published)
Publication No.: WO2018085722A1
Publication date: 2018-05-11
Application No.: PCT/US2017/060049
Filing date: 2017-11-03
Applicant: SALESFORCE.COM, INC.
Inventors: BRADBURY, James; MERITY, Stephen, Joseph; XIONG, Caiming; SOCHER, Richard
IPC classification: G06N3/04, G10L15/16, G10L15/18, G06F17/20, G10L25/30
CPC classification: G06N3/08, G06F17/16, G06F17/20, G06F17/2715, G06F17/2785, G06F17/2818, G06N3/04, G06N3/0445, G06N3/0454, G06N3/10, G10L15/16, G10L15/18, G10L15/1815, G10L25/30
Abstract: The technology disclosed provides a quasi-recurrent neural network (QRNN) that alternates convolutional layers, which apply in parallel across timesteps, and minimalist recurrent pooling layers that apply in parallel across feature dimensions.
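
The alternation of convolution and recurrent pooling described in the QRNN abstract can be illustrated with a minimal pure-Python sketch. It assumes the "f-pooling" form known from the published QRNN literature (a tanh candidate and a sigmoid forget gate, combined as h_t = f_t * h_{t-1} + (1 - f_t) * z_t); the abstract itself does not spell out the gates, and the function names and weight layout here are hypothetical.

```python
import math


def causal_conv(xs, weights):
    """Causal 1-D convolution over time.

    xs: list of T input vectors (each a list of n_in floats).
    weights[j][o][i]: weight for input feature i at lag j, output feature o.
    Each timestep depends only on current and past inputs, so all
    timesteps could be computed in parallel.
    """
    n_out = len(weights[0])
    out = []
    for t in range(len(xs)):
        row = [0.0] * n_out
        for j, w_lag in enumerate(weights):
            if t - j < 0:
                continue  # implicit zero padding on the left
            x = xs[t - j]
            for o in range(n_out):
                row[o] += sum(w * v for w, v in zip(w_lag[o], x))
        out.append(row)
    return out


def qrnn_layer(xs, w_z, w_f):
    """One QRNN layer: convolutions, then element-wise recurrent pooling."""
    z_pre = causal_conv(xs, w_z)  # candidate pre-activations
    f_pre = causal_conv(xs, w_f)  # forget-gate pre-activations
    h = [0.0] * len(w_z[0])
    hidden = []
    for z_t, f_t in zip(z_pre, f_pre):
        z = [math.tanh(v) for v in z_t]
        f = [1.0 / (1.0 + math.exp(-v)) for v in f_t]
        # Assumed f-pooling: the only sequential step, and it has no
        # matrix multiply, so it is independent per feature dimension.
        h = [fi * hi + (1.0 - fi) * zi for fi, hi, zi in zip(f, h, z)]
        hidden.append(h)
    return hidden
```

For example, `qrnn_layer([[1.0], [0.0]], [[[1.0]], [[0.0]]], [[[0.0]], [[0.0]]])` runs a one-feature layer with filter width 2 over two timesteps.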

94. Invention application

WO2018028767A1 DEVICES AND METHODS FOR EVALUATING SPEECH QUALITY (Pending - Published)
Publication No.: WO2018028767A1
Publication date: 2018-02-15
Application No.: PCT/EP2016/068966
Filing date: 2016-08-09
Applicants: HUAWEI TECHNOLOGIES CO., LTD.; XIAO, Wei
Inventors: XIAO, Wei; HAKAMI, Mona; KLEIJN, Willem Bastiaan
IPC classification: G10L25/69, G10L25/30
CPC classification: G10L25/60, G06F17/18, G10L15/02, G10L25/03, G10L25/30, G10L25/69
Abstract: The invention relates to an apparatus (200) for determining a quality score (MOS) for an audio signal sample, the apparatus (200) comprising: an extractor (201) configured to extract a feature vector from the audio signal sample, wherein the feature vector comprises a plurality of feature values and wherein each feature value is associated with a different feature of the feature vector; a pre-processor (203) configured to pre-process a feature value of the feature vector based on a cumulative distribution function associated with the feature represented by the feature value to obtain a pre-processed feature value; and a processor (205) configured to implement a neural network and to determine the quality score (MOS) for the audio signal sample based on the pre-processed feature value and a set of neural network parameters for the neural network associated with the cumulative distribution function.
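
One way to read the pre-processing step in this abstract is as a mapping of each raw feature value through a per-feature cumulative distribution function, which yields inputs in [0, 1] for the neural network. The sketch below assumes an empirical CDF estimated from training values; the abstract does not say how the CDF is obtained, and both function names are hypothetical.

```python
from bisect import bisect_right


def fit_empirical_cdf(training_values):
    """Store the sorted training values of one feature (assumed CDF model)."""
    if not training_values:
        raise ValueError("need at least one training value")
    return sorted(training_values)


def cdf_transform(sorted_values, x):
    """Map a raw feature value to the fraction of training values <= x."""
    return bisect_right(sorted_values, x) / len(sorted_values)
```

With training values `[3, 1, 2, 4]`, the value `2` maps to `0.5`, and values below or above the training range map to `0.0` and `1.0` respectively.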

95. Invention application

WO2017196929A1 AUDIO PROCESSING WITH NEURAL NETWORKS (Pending - Published)
Publication No.: WO2017196929A1
Publication date: 2017-11-16
Application No.: PCT/US2017/031888
Filing date: 2017-05-10
Applicant: GOOGLE LLC
Inventors: ROBLEK, Dominik; SHARIFI, Matthew
IPC classification: G10L25/30
CPC classification: G06N3/08, G06F3/16, G06N3/049, G06N3/084, G10L25/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio processing using neural networks. One of the systems includes multiple neural network layers, wherein the neural network system is configured to receive time domain features of an audio sample and to process the time domain features to generate a neural network output for the audio sample, the plurality of neural network layers comprising: a frequency-transform (F-T) layer that is configured to apply a transformation defined by a set of F-T layer parameters that transforms a window of time domain features into frequency domain features; and one or more other neural network layers having respective layer parameters, wherein the one or more neural network layers are configured to process frequency domain features to generate a neural network output.
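
A parameterized frequency-transform layer of the kind described here can be sketched as a pair of weight matrices applied to a window of time-domain samples. The sketch assumes the weights start out as the discrete Fourier transform basis and that training would later adjust them; that initialization and the names `dft_init` and `ft_layer` are illustrative assumptions, not details taken from the abstract.

```python
import math


def dft_init(n):
    """Assumed initial F-T layer parameters: real and imaginary DFT bases."""
    bins = n // 2 + 1
    cos_w = [[math.cos(2 * math.pi * k * t / n) for t in range(n)]
             for k in range(bins)]
    sin_w = [[-math.sin(2 * math.pi * k * t / n) for t in range(n)]
             for k in range(bins)]
    return cos_w, sin_w


def ft_layer(window, cos_w, sin_w):
    """Apply the layer to one window and return per-bin magnitudes."""
    features = []
    for c_row, s_row in zip(cos_w, sin_w):
        re = sum(c * x for c, x in zip(c_row, window))
        im = sum(s * x for s, x in zip(s_row, window))
        features.append(math.hypot(re, im))
    return features
```

Because the transform is expressed as ordinary layer weights, it could in principle be trained jointly with the layers that consume its output.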

96. Invention application

WO2016063795A1 METHOD FOR TRANSFORMING A NOISY SPEECH SIGNAL TO AN ENHANCED SPEECH SIGNAL (Pending - Published)
Publication No.: WO2016063795A1
Publication date: 2016-04-28
Application No.: PCT/JP2015/079242
Filing date: 2015-10-08
Applicant: MITSUBISHI ELECTRIC CORPORATION
Inventors: ERDOGAN, Hakan; HERSHEY, John; WATANABE, Shinji; LE ROUX, Jonathan
IPC classification: G10L25/30, G10L21/0208, G10L25/03, G10L21/0324, G06N3/02
CPC classification: G10L21/0208, G10L21/0216, G10L21/0324, G10L25/03, G10L25/30
Abstract: A method transforms a noisy speech signal to an enhanced speech signal, by first acquiring the noisy speech signal from an environment. The noisy speech signal is processed by an automatic speech recognition system (ASR) to produce ASR features. The ASR features and noisy speech spectral features are processed using an enhancement network having network parameters to produce a mask. Then, the mask is applied to the noisy speech signal to obtain the enhanced speech signal.
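
The final masking step can be sketched as an element-wise product between the mask and a time-frequency representation of the noisy signal. The sketch assumes complex STFT bins and a real-valued mask in [0, 1]; the enhancement network that would produce the mask is not shown, and `apply_mask` is a hypothetical name.

```python
def apply_mask(noisy_spectrum, mask):
    """Scale each time-frequency bin of the noisy spectrum by its mask value.

    noisy_spectrum: list of frames, each a list of complex bins (assumed STFT).
    mask: same shape, real values (assumed to lie in [0, 1]).
    """
    if len(noisy_spectrum) != len(mask):
        raise ValueError("mask and spectrum must have the same number of frames")
    enhanced = []
    for frame, mask_row in zip(noisy_spectrum, mask):
        if len(frame) != len(mask_row):
            raise ValueError("mask and spectrum must have the same number of bins")
        enhanced.append([m * x for m, x in zip(mask_row, frame)])
    return enhanced
```

An inverse transform of the masked spectrum would then give the enhanced time-domain signal.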

97. Invention application

WO2016049611A1 NEURAL NETWORK VOICE ACTIVITY DETECTION EMPLOYING RUNNING RANGE NORMALIZATION (Pending - Published)
Publication No.: WO2016049611A1
Publication date: 2016-03-31
Application No.: PCT/US2015/052519
Filing date: 2015-09-26
Applicant: CYPHER, LLC
Inventor: VICKERS, Earl
IPC classification: G10L15/16, G10L25/27, G10L25/78
CPC classification: G10L21/0264, G10L21/0224, G10L25/30, G10L25/60, G10L25/78, G10L25/84, G10L2015/0636
Abstract: A "running range normalization" method includes computing running estimates of the range of values of features useful for voice activity detection (VAD) and normalizing the features by mapping them to a desired range. Running range normalization includes computation of running estimates of the minimum and maximum values of VAD features and normalizing the feature values by mapping the original range to a desired range. Smoothing coefficients are optionally selected to directionally bias a rate of change of at least one of the running estimates of the minimum and maximum values. The normalized VAD feature parameters are used to train a machine learning algorithm to detect voice activity and to use the trained machine learning algorithm to isolate or enhance the speech component of the audio data.
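
Running range normalization as summarized in this abstract can be sketched with two exponentially smoothed estimates, one for the minimum and one for the maximum. The sketch assumes a "fast" coefficient when an estimate moves outward and a "slow" one when it relaxes inward, which is one possible reading of the directional bias; the class name, parameter names, and default values are hypothetical.

```python
class RunningRangeNormalizer:
    """Track running min/max of one feature and map values to [lo, hi]."""

    def __init__(self, fast=0.5, slow=0.01, lo=0.0, hi=1.0):
        self.fast = fast  # assumed coefficient when the range must widen
        self.slow = slow  # assumed coefficient when the range relaxes
        self.lo = lo
        self.hi = hi
        self.run_min = None
        self.run_max = None

    def update(self, x):
        """Update the running estimates with x and return x normalized."""
        if self.run_min is None:
            self.run_min = self.run_max = float(x)
        else:
            a = self.fast if x < self.run_min else self.slow
            self.run_min += a * (x - self.run_min)
            b = self.fast if x > self.run_max else self.slow
            self.run_max += b * (x - self.run_max)
        span = self.run_max - self.run_min
        if span <= 1e-12:
            return (self.lo + self.hi) / 2.0  # range not yet established
        unit = (x - self.run_min) / span
        unit = min(1.0, max(0.0, unit))  # clamp values outside the estimate
        return self.lo + unit * (self.hi - self.lo)
```

Calling `update` once per frame yields a normalized feature stream that could be fed to a VAD classifier.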

98. Invention application

WO2016040885A1 SYSTEMS AND METHODS FOR RESTORATION OF SPEECH COMPONENTS (Pending - Published)
Publication No.: WO2016040885A1
Publication date: 2016-03-17
Application No.: PCT/US2015/049816
Filing date: 2015-09-11
Applicant: AUDIENCE, INC.
Inventors: AVENDANO, Carlos; WOODRUFF, John
IPC classification: G10L21/02
CPC classification: G10L21/02, G10L21/0208, G10L21/038, G10L25/30
Abstract: A method for restoring distorted speech components of an audio signal distorted by a noise reduction or a noise cancellation includes determining distorted frequency regions and undistorted frequency regions in the audio signal. The distorted frequency regions include regions of the audio signal in which a speech distortion is present. Iterations are performed using a model to refine predictions of the audio signal at distorted frequency regions. The model is configured to modify the audio signal and may include a deep neural network trained using spectral envelopes of clean or undamaged audio signals. Before each iteration, the audio signal at the undistorted frequency regions is restored to values of the audio signal prior to the first iteration, while the audio signal at distorted frequency regions is refined starting from zero at the first iteration. Iterations are ended when discrepancies of the audio signal at undistorted frequency regions meet pre-defined criteria.
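
The iteration scheme in this abstract can be sketched as a loop that keeps model predictions only in the distorted regions and resets the undistorted regions to their original values before every pass. The model is passed in as a plain callable standing in for the trained network; the stopping rule (largest absolute discrepancy below a tolerance) and all names are assumptions made for illustration.

```python
def restore(spectrum, distorted, model, tol=1e-3, max_iters=50):
    """Iteratively refine the distorted bins of a spectral envelope.

    spectrum: list of floats (one value per frequency region).
    distorted: list of bools, True where the region is distorted.
    model: callable mapping a spectrum to a predicted spectrum (hypothetical).
    """
    original = list(spectrum)
    # Distorted regions start from zero at the first iteration.
    current = [0.0 if d else x for x, d in zip(original, distorted)]
    for _ in range(max_iters):
        predicted = model(current)
        # Discrepancy is measured on undistorted regions only.
        err = max((abs(p - o)
                   for p, o, d in zip(predicted, original, distorted) if not d),
                  default=0.0)
        # Keep refined predictions where distorted; restore originals elsewhere.
        current = [p if d else o
                   for p, o, d in zip(predicted, original, distorted)]
        if err <= tol:  # assumed form of the pre-defined criterion
            break
    return current
```

With a toy model that replaces every bin by the mean of the spectrum, a single zeroed bin among equal neighbors converges toward the neighbors' value.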

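99. Invention application

WO2015147362A1 차신호 고주파 신호의 비교법에 의한 음주 판별 방법, 이를 수행하기 위한 기록 매체 및 장치 (Pending - Published)
Title translation: Method for determining alcohol consumption by comparing high-frequency signals in a difference signal, and recording medium and device for implementing same
Publication No.: WO2015147362A1
Publication date: 2015-10-01
Application No.: PCT/KR2014/002849
Filing date: 2014-04-02
Applicants: 숭실대학교산학협력단; (주) 지씨에스씨
Inventors: 배명진; 이상길; 배성근
IPC classification: G10L25/51, G10L25/93
CPC classification: G10L25/66, A61B5/4803, A61B5/4845, A61B5/7257, A61B5/7264, A61B7/00, G10L25/15, G10L25/21, G10L25/30, G10L25/93
Abstract: A method for determining alcohol consumption comprises the steps of: detecting effective frames of an input voice signal; detecting a difference signal of the original signal of the effective frames; performing a fast Fourier transform on the difference signal to convert it to the frequency domain; detecting high-frequency components of the fast-Fourier-transformed difference signal; and determining a state of alcohol consumption on the basis of a slope difference between the high-frequency components. Accordingly, whether and to what degree a driver or operator at a remote location has been drinking can be determined, so that accidents caused by drunk driving or operation can be prevented.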

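100. Invention application

WO2015139523A1 一种数据解码方法及装置 (Pending - Published)
Title translation: Data decoding method and apparatus
Publication No.: WO2015139523A1
Publication date: 2015-09-24
Application No.: PCT/CN2015/070736
Filing date: 2015-01-15
Applicant: 天地融科技股份有限公司 (Tendyron Corporation)
Inventor: 李东声 (LI, Dongsheng)
IPC classification: H03K5/01
CPC classification: G10L19/025, G10L19/005, G10L19/008, G10L19/22, G10L21/0208, G10L21/038, G10L25/30, H03K5/082
Abstract: A data decoding method and apparatus, relating to the field of electronic technology. The method comprises: receiving a sine wave through an audio interface, the sine wave comprising waveforms of at least one period, wherein waveforms of different periods represent different bit values (101); processing the sine wave into a first square wave, the first square wave carrying data to be decoded (102); determining, on the basis of a preset threshold, whether the first square wave contains a glitch waveform, or determining, on the basis of an adaptive threshold, whether the first square wave contains a glitch waveform, wherein the adaptive threshold is calculated from synchronization header data carried in the sine wave (103); if the first square wave contains a glitch waveform, removing the glitch waveform from the first square wave to obtain a second square wave (104); and decoding the second square wave to obtain decoded data (105). The method and apparatus can improve the accuracy and success rate of data decoding.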
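
Steps 102 to 104 of this abstract can be sketched on sampled data: the sine wave is turned into a square wave by thresholding at zero, and any run of identical levels shorter than a threshold is treated as a glitch and merged into the preceding level. The sketch assumes the threshold is a minimum run length in samples (the preset-threshold variant); the abstract does not state what the threshold measures, and the function names are hypothetical.

```python
def to_square(samples):
    """Convert sampled sine-wave values to a 0/1 square wave (sign threshold)."""
    return [1 if s >= 0 else 0 for s in samples]


def remove_glitches(square, min_run):
    """Return a copy in which runs shorter than min_run take the previous level.

    min_run: assumed preset threshold, expressed as a run length in samples.
    The first run is never altered because it has no preceding level.
    """
    out = list(square)
    n = len(out)
    i = 0
    while i < n:
        j = i
        while j < n and out[j] == out[i]:
            j += 1  # advance to the end of the current run
        if i > 0 and j - i < min_run:
            for k in range(i, j):
                out[k] = out[i - 1]  # merge the glitch into the previous level
        i = j
    return out
```

The cleaned square wave would then be decoded by measuring the period of each remaining waveform, which the sketch leaves out.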
