Quick Patent Search: search patents worldwide, free commercial patent database - IPRDB

1. Granted patent

US06735567B2 Encoding and decoding speech signals variably based on signal classification (in force)
Publication (announcement) number: US06735567B2
Publication (announcement) date: 2004-05-11
Application number: US10409430
Filing date: 2003-04-08
Applicant: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
Inventor: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
IPC classification: G10L13/04
CPC classification: G10L19/00, G10L19/167, G10L19/24, H03G3/00
Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.
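Illustrative sketch (not part of the patent record): the abstract of US06735567B2 describes four codecs chosen by a rate selection, with the full-rate and half-rate codecs further chosen by a type classification. The Python below is a minimal sketch of that two-level dispatch only. The names `Rate` and `select_codec`, the integer type labels 0 and 1, and the returned label strings are all assumptions made for illustration; the abstract does not say how the rate or the type is actually decided.

```python
from enum import Enum


class Rate(Enum):
    """The four bit-rate tiers named in the abstract."""
    FULL = "full"
    HALF = "half"
    QUARTER = "quarter"
    EIGHTH = "eighth"


def select_codec(rate: Rate, frame_type: int) -> str:
    """Return a label for the codec path used for one frame.

    frame_type is a hypothetical type classification (0 or 1 here).
    Per the abstract, it only matters for the full- and half-rate codecs.
    """
    if rate in (Rate.FULL, Rate.HALF):
        return f"{rate.value}-rate/type-{frame_type}"
    # Quarter- and eighth-rate codecs are selected by rate alone.
    return f"{rate.value}-rate"
```

For example, `select_codec(Rate.FULL, 1)` picks the hypothetical type-1 path of the full-rate codec, while `select_codec(Rate.EIGHTH, 1)` ignores the type.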

2. Granted patent

US06581032B1 Bitstream protocol for transmission of encoded voice signals (in force)
Publication (announcement) number: US06581032B1
Publication (announcement) date: 2003-06-17
Application number: US09662828
Filing date: 2000-09-15
Applicant: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
Inventor: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
IPC classification: G10L19/12
CPC classification: G10L19/00, G10L19/167, G10L19/24, H03G3/00
Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.

3. Patent application

US20090043574A1 Speech coding system and method using bi-directional mirror-image predicted pulses (in force)
Publication (announcement) number: US20090043574A1
Publication (announcement) date: 2009-02-12
Application number: US12284623
Filing date: 2008-09-23
Applicant: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
Inventor: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
IPC classification: G10L19/12, G10L19/00
CPC classification: G10L19/00, G10L19/167, G10L19/20, G10L19/22, G10L19/24, G10L2019/0001, H03G3/00
Abstract: There is provided a method of decoding speech data generated from a speech signal. The method comprises receiving the speech data having at least one main pulse in a subframe of the speech data; generating a first predicted pulse, based on the at least one main pulse, on one side of the main pulse in the subframe of the speech data, wherein the first predicted pulse has a lower gain than the main pulse; generating a second predicted pulse, as a mirror image of the first predicted pulse on a reverse time scale, on the other side of the main pulse in the subframe of the speech data; reconstructing the speech signal using the at least one main pulse, the first predicted pulse and the second predicted pulse.
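Illustrative sketch (not part of the patent record): the abstract of US20090043574A1 describes placing a lower-gain predicted pulse on one side of a main pulse and its mirror image on the other side. The Python below is one possible reading of that idea for a single main pulse. The function name `build_excitation`, the `offset` parameter (for instance a pitch lag), and the single scalar `gain` are assumptions made for illustration; the abstract does not specify how the pulse spacing or gain is derived.

```python
def build_excitation(subframe_len, main_pos, main_amp, offset, gain):
    """Build a subframe excitation from one main pulse and two predicted pulses.

    offset: assumed spacing in samples between the main pulse and each
            predicted pulse (hypothetical parameter).
    gain:   assumed scaling of the predicted pulses; must be below 1 so
            they are weaker than the main pulse, as the abstract requires.
    """
    if not 0 <= main_pos < subframe_len:
        raise ValueError("main pulse must lie inside the subframe")
    if offset <= 0:
        raise ValueError("offset must be positive")
    if not 0.0 < gain < 1.0:
        raise ValueError("predicted pulses must have lower gain than the main pulse")

    excitation = [0.0] * subframe_len
    excitation[main_pos] = main_amp

    # First predicted pulse: later in time than the main pulse.
    forward = main_pos + offset
    if forward < subframe_len:
        excitation[forward] += gain * main_amp

    # Second predicted pulse: mirror image on the reverse time scale.
    backward = main_pos - offset
    if backward >= 0:
        excitation[backward] += gain * main_amp

    return excitation
```

Predicted pulses that would fall outside the subframe are simply dropped in this sketch; a real decoder would have its own rule for that case.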

4. Granted patent

US07346502B2 Adaptive noise state update for a voice activity detector (in force)
Publication (announcement) number: US07346502B2
Publication (announcement) date: 2008-03-18
Application number: US11342130
Filing date: 2006-01-26
Applicant: Yang Gao, Eyal Shlomot, Adil Benyassine
Inventor: Yang Gao, Eyal Shlomot, Adil Benyassine
IPC classification: G10L11/06
CPC classification: G10L25/78, G10L2025/786
Abstract: There is provided a method of updating a noise state of a voice activity detector (VAD) for indicating an active voice mode and an inactive voice mode. The method comprises receiving an input signal having a plurality of frames, determining an elapsed time since the last update of the noise state, updating the noise state of the VAD if the elapsed time exceeds a predetermined time, determining an average minimum energy based on two or more of the plurality of frames, determining a current minimum energy based on a current frame of the plurality of frames, updating the noise state of the VAD if the average minimum energy is less than the current minimum energy, and updating the noise state of the VAD if the average minimum energy is greater than the current minimum energy plus a first predetermined value.
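Illustrative sketch (not part of the patent record): the abstract of US07346502B2 lists three conditions under which the VAD noise state is updated. The Python below expresses those conditions as a single predicate. Treating them as alternatives joined by "or" is an assumption, as are the function name `should_update_noise_state` and the parameter names; the abstract gives no concrete thresholds or units.

```python
def should_update_noise_state(elapsed, max_elapsed,
                              avg_min_energy, cur_min_energy, margin):
    """Return True if any update condition from the abstract holds.

    elapsed / max_elapsed: time since the last update and the
        predetermined limit (same assumed unit, e.g. frames).
    avg_min_energy: average minimum energy over two or more frames.
    cur_min_energy: minimum energy of the current frame.
    margin: the "first predetermined value" added to cur_min_energy.
    """
    # Condition 1: too long since the last update.
    if elapsed > max_elapsed:
        return True
    # Condition 2: the average minimum is below the current minimum.
    if avg_min_energy < cur_min_energy:
        return True
    # Condition 3: the average minimum exceeds the current minimum by more than the margin.
    if avg_min_energy > cur_min_energy + margin:
        return True
    return False
```

Under this reading, no update happens only when the average minimum energy lies between the current minimum energy and that energy plus the margin, and the elapsed time is still within its limit.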

5. Granted patent

US06961698B1 Multi-mode bitstream transmission protocol of encoded voice signals with embeded characteristics (lapsed)
Publication (announcement) number: US06961698B1
Publication (announcement) date: 2005-11-01
Application number: US10420654
Filing date: 2003-04-21
Applicant: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
Inventor: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
IPC classification: G10L19/00, G10L13/00, G10L13/04, G10L19/02, G10L19/04, G10L19/08, G10L19/10, G10L19/12, G10L19/14, H03M7/30, H03M7/36
CPC classification: G10L19/00, G10L19/167, G10L19/24, H03G3/00
Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The bitstream comprises a type component and a gain component. The type component is representative of a type classification of a frame of speech signal that is transmitted. The type component comprises a first type and second type. The gain component represents an adaptive codebook gain and a fixed codebook gain component comprises a fixed codebook gain component and an adaptive codebook gain component exclusively encoded as separate components of the bitstream as a function of the bit rate when the type classification is the second type.

6. Granted patent

US06463414B1 Conference bridge processing of speech in a packet network environment (in force)
Publication (announcement) number: US06463414B1
Publication (announcement) date: 2002-10-08
Application number: US09547832
Filing date: 2000-04-12
Applicant: Huan-Yu Su, Eyal Shlomot, Jes Thyssen, Adil Benyassine, Yang Gao
Inventor: Huan-Yu Su, Eyal Shlomot, Jes Thyssen, Adil Benyassine, Yang Gao
IPC classification: G10L11/02
CPC classification: G10L19/173
Abstract: There is provided a conference bridge or transcoder configured to intelligently handle multiple speech channels in the context of a packet network, wherein various speech channels may adhere to a variety of speech encoding standards. For example, the conference bridge establishes framing and alignment of multiple incoming speech channels associated with multiple participants, extracts parameters from the speech samples, mixes the parameters, and re-encodes the resulting speech samples for transmission to the participants. In one aspect, a speech processing method comprises decoding a first bitstream according to a first coding scheme to generate first speech samples and a first side information; generating second speech samples and a second side information using the first speech samples and the first side information, for use according to a second coding scheme; and creating a second bitstream, encoded based on the second coding scheme, using the second speech samples and the second side information.

7. Granted patent

US08195450B2 Decoder with embedded silence and background noise compression (in force)
Publication (announcement) number: US08195450B2
Publication (announcement) date: 2012-06-05
Application number: US13199794
Filing date: 2011-09-08
Applicant: Eyal Shlomot, Yang Gao, Adil Benyassine
Inventor: Eyal Shlomot, Yang Gao, Adil Benyassine
IPC classification: G10L21/00, G10L11/06
CPC classification: G10L19/24, G10L19/012, G10L19/0208
Abstract: There is provided a method for use by a speech encoder to encode an input speech signal. The method comprises receiving the input speech signal; determining whether the input speech signal includes an active speech signal or an inactive speech signal; low-pass filtering the inactive speech signal to generate a narrowband inactive speech signal; high-pass filtering the inactive speech signal to generate a high-band inactive speech signal; encoding the narrowband inactive speech signal using a narrowband inactive speech encoder to generate an encoded narrowband inactive speech; generating a low-to-high auxiliary signal by the narrowband inactive speech encoder based on the narrowband inactive speech signal; encoding the high-band inactive speech signal using a wideband inactive speech encoder to generate an encoded wideband inactive speech based on the low-to-high auxiliary signal from the narrowband inactive speech encoder; and transmitting the encoded narrowband inactive speech and the encoded wideband inactive speech.

8. Granted patent

US07711107B1 Perceptual masking of residual echo (in force)
Publication (announcement) number: US07711107B1
Publication (announcement) date: 2010-05-04
Application number: US11129450
Filing date: 2005-05-12
Applicant: Carlo Murgia, Jeffrey D. Klein, Adil Benyassine, Eyal Shlomot, Yang Gao
Inventor: Carlo Murgia, Jeffrey D. Klein, Adil Benyassine, Eyal Shlomot, Yang Gao
IPC classification: H04M9/08
CPC classification: H04B3/234
Abstract: A method of masking a residual echo signal by an echo canceller is provided. The method comprises receiving a far-end signal, adjusting filter coefficients of an adaptive filter in response to the far-end signal, generating an echo model signal based on the far-end signal using the adaptive filter, receiving a near-end signal, subtracting the echo model signal from the near-end signal to generate an output signal, defining a spectral mask based on the near-end signal, wherein the spectral mask is indicative of near-end spectral peaks and near-end spectral valleys, de-emphasizing the output signal in spectral regions of the near-end spectral peaks, and emphasizing the output signal in spectral regions of the near-end spectral valleys, wherein the de-emphasizing occurs during filter coefficients determination for the adaptive filter. A weighted filter may perform the de-emphasizing and the emphasizing operations, where the weighted filter uses medium term spectral characteristics of the near-end signal.

9. Patent application

US20080195383A1 Embedded silence and background noise compression (in force)
Publication (announcement) number: US20080195383A1
Publication (announcement) date: 2008-08-14
Application number: US12002131
Filing date: 2007-12-14
Applicant: Eyal Shlomot, Yang Gao, Adil Benyassine
Inventor: Eyal Shlomot, Yang Gao, Adil Benyassine
IPC classification: G10L19/14
CPC classification: G10L19/24, G10L19/012, G10L19/0208
Abstract: There is provided a method for use by a speech encoder to encode an input speech signal. The method comprises receiving the input speech signal; determining whether the input speech signal includes an active speech signal or an inactive speech signal; low-pass filtering the inactive speech signal to generate a narrowband inactive speech signal; high-pass filtering the inactive speech signal to generate a high-band inactive speech signal; encoding the narrowband inactive speech signal using a narrowband inactive speech encoder to generate an encoded narrowband inactive speech; generating a low-to-high auxiliary signal by the narrowband inactive speech encoder based on the narrowband inactive speech signal; encoding the high-band inactive speech signal using a wideband inactive speech encoder to generate an encoded wideband inactive speech based on the low-to-high auxiliary signal from the narrowband inactive speech encoder; and transmitting the encoded narrowband inactive speech and the encoded wideband inactive speech.

10. Patent application

US20060217973A1 Adaptive voice mode extension for a voice activity detector (in force)
Publication (announcement) number: US20060217973A1
Publication (announcement) date: 2006-09-28
Application number: US11342104
Filing date: 2006-01-26
Applicant: Yang Gao, Eyal Shlomot, Adil Benyassine
Inventor: Yang Gao, Eyal Shlomot, Adil Benyassine
IPC classification: G10L19/12
CPC classification: G10L25/78, G10L2025/786
Abstract: There is provided a voice activity detection method for indicating an active voice mode and an inactive voice mode. The method comprises receiving a first portion of an input signal; determining that the first portion of the input signal includes an active voice signal; indicating the active voice mode in response to the determining that the first portion of the input signal includes the active voice signal; receiving a second portion of the input signal immediately following the first portion of the input signal; determining that the second portion of the input signal includes an inactive voice signal; extending the indicating the active voice mode for a period of time after determining that the second portion of the input signal includes the inactive voice signal, wherein the period of time varies based on one or more conditions; and indicating the inactive voice mode after expiration of the period of time.
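Illustrative sketch (not part of the patent record): the abstract of US20060217973A1 describes keeping the active voice indication on for a variable period after the signal turns inactive. The Python below is a minimal frame-based sketch of that behaviour. The function name `apply_hangover`, the per-frame boolean input, and the `hangover_frames` callback that supplies the extension length are assumptions made for illustration; the abstract does not name the conditions that set the period.

```python
def apply_hangover(raw_decisions, hangover_frames):
    """Extend active-voice indications after speech ends.

    raw_decisions:   list of bool, True where a frame was judged active.
    hangover_frames: callable taking the index of an active frame and
                     returning how many following inactive frames should
                     still be reported as active (hypothetical hook for
                     the "one or more conditions" in the abstract).
    """
    output = []
    remaining = 0
    for index, active in enumerate(raw_decisions):
        if active:
            # Re-arm the extension period on every active frame.
            remaining = hangover_frames(index)
            output.append(True)
        elif remaining > 0:
            # Inactive frame inside the extension period: still report active.
            remaining -= 1
            output.append(True)
        else:
            output.append(False)
    return output
```

With a constant two-frame extension, `apply_hangover([True, False, False, False], lambda i: 2)` reports the first three frames as active and only the fourth as inactive.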
