Quick Patent Search: search global patents quickly, free commercial patent database - IPRDB

71. Granted invention patent

US07912712B2 Method and apparatus for encoding and decoding of background noise based on the extracted background noise characteristic parameters (status: in force)
Publication (announcement) No.: US07912712B2
Publication (announcement) date: 2011-03-22
Application No.: US12881926
Filing date: 2010-09-14
Applicants: Eyal Shlomot, Libin Zhang, Jinliang Dai
Inventors: Eyal Shlomot, Libin Zhang, Jinliang Dai
IPC classification: G10L21/02, G10L11/02
CPC classification: G10L19/012
Abstract: An encoding method includes extracting background noise characteristic parameters within a hangover period; for a first superframe after the hangover period, performing background noise encoding based on the extracted background noise characteristic parameters; for superframes after the first superframe, performing background noise characteristic parameter extraction and a DTX decision for each frame in those superframes; and, for the superframes after the first superframe, performing background noise encoding based on the extracted background noise characteristic parameters of the current superframe, the background noise characteristic parameters of a plurality of superframes previous to the current superframe, and a final DTX decision. A decoding method and apparatus and an encoding apparatus are also disclosed.

72. Granted invention patent

US07558729B1 Music detection for enhancing echo cancellation and speech coding (status: in force)
Publication (announcement) No.: US07558729B1
Publication (announcement) date: 2009-07-07
Application No.: US11084392
Filing date: 2005-03-17
Applicants: Adil Benyassine, Yang Gao, Carlo Murgia, Eyal Shlomot
Inventors: Adil Benyassine, Yang Gao, Carlo Murgia, Eyal Shlomot
IPC classification: G10L21/02, G10L19/14, G10L15/20, H04B3/20, H04M9/08
CPC classification: G10L25/48, G10L25/78
Abstract: A method of using music detection to enhance an operation of an echo canceller is provided, wherein the echo canceller includes an adaptive filter and a nonlinear processor. The method comprises receiving, by the echo canceller, an input signal including an echo signal from a near end device; filtering the input signal using the adaptive filter to eliminate linear components of the echo signal in the input signal and generate an error signal; analyzing the error signal using a music detector to determine existence of a music signal in the error signal; bypassing the nonlinear processor if the analyzing determines the music signal exists in the error signal; and eliminating nonlinear components of the echo signal from the error signal using the nonlinear processor if the analyzing determines the music signal does not exist in the error signal.
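The control flow described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the function name `cancel_echo` and the three stage parameters (`adaptive_filter`, `music_detector`, `nonlinear_processor`) are hypothetical, and each stage is passed in as a plain callable so that only the bypass decision is shown.

```python
from typing import Callable, List

Signal = List[float]


def cancel_echo(
    input_signal: Signal,
    adaptive_filter: Callable[[Signal], Signal],
    music_detector: Callable[[Signal], bool],
    nonlinear_processor: Callable[[Signal], Signal],
) -> Signal:
    """Route a signal through an echo canceller with a music-aware bypass.

    All three stages are hypothetical stand-ins supplied by the caller.
    """
    # Stage 1: the adaptive filter removes the linear echo components
    # and yields the error signal.
    error_signal = adaptive_filter(input_signal)

    # Stage 2: if music is detected in the error signal, bypass the
    # nonlinear processor and pass the error signal through unchanged.
    if music_detector(error_signal):
        return error_signal

    # Stage 3: otherwise the nonlinear processor removes the remaining
    # nonlinear echo components.
    return nonlinear_processor(error_signal)
```

With toy stand-ins (a filter that halves each sample and a nonlinear processor that mutes everything), the output is the halved signal when the detector reports music and silence otherwise.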

73. Invention patent application

US20090012784A1 Speech transcoding in GSM networks (status: in force)
Publication (announcement) No.: US20090012784A1
Publication (announcement) date: 2009-01-08
Application No.: US11825424
Filing date: 2007-07-06
Applicants: Carlo Murgia, Yang Gao, Aruna Vittal, Eyal Shlomot
Inventors: Carlo Murgia, Yang Gao, Aruna Vittal, Eyal Shlomot
IPC classification: G10L19/00
CPC classification: G10L19/173
Abstract: There is provided a method of transcoding an Enhanced Full Rate (EFR) 12.2 Kbps encoded frame into an Adaptive Multi-Rate (AMR) 12.2 Kbps encoded frame, where the method comprises receiving the EFR 12.2 Kbps encoded frame from a first codec; determining if the EFR 12.2 Kbps encoded frame is a Silence Insertion Descriptor (SID) frame; and, if the EFR 12.2 Kbps encoded frame is determined to be the SID frame, transcoding the EFR SID frame. There is also provided a method of transcoding an AMR 12.2 Kbps encoded frame into an EFR 12.2 Kbps encoded frame, where the method comprises receiving the AMR 12.2 Kbps encoded frame from a first codec; determining if the AMR 12.2 Kbps encoded frame is an SID frame; and, if the AMR 12.2 Kbps encoded frame is determined to be the SID frame, transcoding the AMR SID frame.

74. Granted invention patent

US07231348B1 Tone detection algorithm for a voice activity detector (status: in force)
Publication (announcement) No.: US07231348B1
Publication (announcement) date: 2007-06-12
Application No.: US11342103
Filing date: 2006-01-26
Applicants: Yang Gao, Eyal Shlomot, Adil Benyassine
Inventors: Yang Gao, Eyal Shlomot, Adil Benyassine
IPC classification: G10L21/00
CPC classification: G10L25/78
Abstract: There is provided a voice activity detection method for indicating an active voice mode and an inactive voice mode. The method comprises receiving an input signal having a plurality of frames, determining whether each of the plurality of frames includes an active voice signal or an inactive voice signal, determining a second reflection coefficient for each frame determined to include the inactive voice signal, comparing the second reflection coefficient with a reflection threshold, and selecting the active voice mode if the second reflection coefficient is greater than the reflection threshold. The method may further comprise selecting the inactive voice mode if the second reflection coefficient is not greater than the reflection threshold. The method may also comprise analyzing the input signal to determine an energy level of the input signal, and selecting the active voice mode if the energy level is greater than an energy threshold.
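The decision logic in this abstract can be sketched as a small function. This is a sketch under assumptions, not the patented algorithm: the name `select_voice_mode`, its parameters, and the default threshold values are hypothetical placeholders, the second reflection coefficient is taken as an already-computed input, and the abstract does not say how the energy test combines with the reflection test, so the sketch treats either test as sufficient to select the active mode.

```python
ACTIVE = "active"
INACTIVE = "inactive"


def select_voice_mode(
    primary_decision_active: bool,
    second_reflection_coeff: float,
    frame_energy: float,
    reflection_threshold: float = 0.9,   # hypothetical placeholder value
    energy_threshold: float = 1000.0,    # hypothetical placeholder value
) -> str:
    """Return the voice mode for one frame.

    primary_decision_active is the initial per-frame active/inactive
    decision; the remaining checks only re-examine frames it marked
    as inactive.
    """
    # Frames already classified as active voice stay active.
    if primary_decision_active:
        return ACTIVE

    # Re-check: a second reflection coefficient above the threshold
    # overrides the inactive decision.
    if second_reflection_coeff > reflection_threshold:
        return ACTIVE

    # Optional energy test (assumed here to be an independent override).
    if frame_energy > energy_threshold:
        return ACTIVE

    return INACTIVE
```

Given the patent title, the re-check presumably exists so that tones which the initial decision classified as inactive are still coded in the active voice mode.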

75. Granted invention patent

US07191122B1 Speech compression system and method (status: in force)
Publication (announcement) No.: US07191122B1
Publication (announcement) date: 2007-03-13
Application No.: US11112394
Filing date: 2005-04-22
Applicants: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
Inventors: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
IPC classification: G10L19/12
CPC classification: G10L19/00, G10L19/167, G10L19/20, G10L19/22, G10L19/24, G10L2019/0001, H03G3/00
Abstract: The invention improves the encoding and decoding of speech by focusing the encoding on the perceptually important characteristics of speech. The system analyzes selected features of an input speech signal and first performs a common frame-based speech coding of the input speech signal. The system then performs a speech coding based on either a first speech coding mode or a second speech coding mode. The selection of a mode is based on characteristics of the input speech signal. The first speech coding mode uses a first framing structure and the second speech coding mode uses a second framing structure.

76. Granted invention patent

US06959274B1 Fixed rate speech compression system and method (status: in force)
Publication (announcement) No.: US06959274B1
Publication (announcement) date: 2005-10-25
Application No.: US09663662
Filing date: 2000-09-15
Applicants: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
Inventors: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
IPC classification: G10L13/00, G10L13/04, G10L19/04, G10L19/08, G10L19/10, G10L19/12, G10L19/14, H03M7/30, H03M7/36
CPC classification: G10L19/00, G10L19/167, G10L19/20, G10L19/22, G10L19/24, G10L2019/0001, H03G3/00
Abstract: The invention improves the encoding and decoding of speech by focusing the encoding on the perceptually important characteristics of speech. The system analyzes selected features of an input speech signal and first performs a common frame-based speech coding of the input speech signal. The system then performs a speech coding based on either a first speech coding mode or a second speech coding mode. The selection of a mode is based on characteristics of the input speech signal. The first speech coding mode uses a first framing structure and the second speech coding mode uses a second framing structure.

77. Invention patent application

US20050010405A1 Complexity resource manager for multi-channel speech processing (status: in force)
Publication (announcement) No.: US20050010405A1
Publication (announcement) date: 2005-01-13
Application No.: US10911118
Filing date: 2004-08-03
Applicants: Eyal Shlomot, Huan-Yu Su
Inventors: Eyal Shlomot, Huan-Yu Su
IPC classification: G10L15/28, G10L19/02
CPC classification: G10L15/285
Abstract: A multi-channel speech processor for encoding speech in a packet network environment is disclosed. In one illustrative aspect, a complexity resource manager (CRM) is executed by a controller or processor. The CRM manages the level of encoding complexity that a signal processing unit (SPU) uses to convert the speech signal into packet data. In general, the CRM determines the level of encoding complexity based on a calculated complexity budget, where the complexity budget is determined from the time required to process prior speech signal channels and the time available to process the remaining channels. In this way, the CRM controls the overall complexity of the speech processor: under certain conditions, it signals the SPU to encode the speech signal in a reduced-complexity mode based on the calculated complexity budget.
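One way to picture the complexity budget is as the time still available in the current frame period divided by the number of channels still to be encoded. The sketch below follows that reading; it is an assumption-laden illustration rather than the method of the application. The function name `choose_modes`, the fixed per-channel costs, and the rule "use the reduced mode when the per-channel budget is below the full-complexity cost" are all hypothetical.

```python
from typing import List


def choose_modes(
    frame_period_ms: float,
    full_cost_ms: float,
    reduced_cost_ms: float,
    num_channels: int,
) -> List[str]:
    """Pick an encoding mode per channel from a running time budget.

    Costs are fixed constants here for simplicity; a real system would
    presumably measure the time actually spent on each prior channel.
    """
    elapsed_ms = 0.0
    modes: List[str] = []
    for channel in range(num_channels):
        remaining_channels = num_channels - channel
        # Budget: time left in the frame period, shared evenly among
        # the channels that still have to be processed.
        budget_ms = (frame_period_ms - elapsed_ms) / remaining_channels
        mode = "full" if budget_ms >= full_cost_ms else "reduced"
        modes.append(mode)
        # Account for the time spent on this channel.
        elapsed_ms += full_cost_ms if mode == "full" else reduced_cost_ms
    return modes
```

For example, with a 10 ms frame period, a 3 ms full-complexity cost, a 1 ms reduced cost, and four channels, the first channel is encoded in the reduced mode (budget 2.5 ms), which frees enough time for the remaining three to run at full complexity.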

78. Granted invention patent

US06721712B1 Conversion scheme for use between DTX and non-DTX speech coding systems (status: in force)
Publication (announcement) No.: US06721712B1
Publication (announcement) date: 2004-04-13
Application No.: US10057250
Filing date: 2002-01-24
Applicants: Adil Benyassine, Eyal Shlomot, Huan-Yu Su
Inventors: Adil Benyassine, Eyal Shlomot, Huan-Yu Su
IPC classification: G10L19/00
CPC classification: H04W88/181, G10L19/173
Abstract: In an exemplary conversion scheme, a frame of a first speech signal comprising a plurality of frames encoded at a plurality of first rates, including a first non-speech rate, is received. The rate of the received frame is determined, and if the received frame is encoded at the first non-speech rate, then the received frame is re-encoded at either a second or third non-speech rate to generate a frame of a second speech signal. Moreover, a system for converting a speech signal comprises a receiver for receiving a frame of a first speech signal and a processor capable of determining the encoding rate of the received frame and re-encoding the received frame at either a second or third non-speech rate if the received frame was originally encoded at a first non-speech rate.

79. Granted invention patent

US06574593B1 Codebook tables for encoding and decoding (status: in force)
Publication (announcement) No.: US06574593B1
Publication (announcement) date: 2003-06-03
Application No.: US09663837
Filing date: 2000-09-15
Applicants: Yang Gao, Adil Benyassine, Huan-yu Su, Eyal Shlomot, Jes Thyssen
Inventors: Yang Gao, Adil Benyassine, Huan-yu Su, Eyal Shlomot, Jes Thyssen
IPC classification: G10L19/12
CPC classification: G10L19/00, G10L19/167, G10L19/24, H03G3/00
Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.

80. Granted invention patent

US06275794B1 System for detecting voice activity and background noise/silence in a speech signal using pitch and signal to noise ratio information (status: in force)
Publication (announcement) No.: US06275794B1
Publication (announcement) date: 2001-08-14
Application No.: US09218334
Filing date: 1998-12-22
Applicants: Adil Benyassine, Eyal Shlomot
Inventors: Adil Benyassine, Eyal Shlomot
IPC classification: G10L11/04
CPC classification: G10L25/78
Abstract: A method and apparatus are disclosed for generating frame voicing decisions for an incoming speech signal having periods of active voice and non-active voice, for use by a speech encoder in a speech communications system. A predetermined set of parameters is extracted from the incoming speech signal, including a pitch gain and a pitch lag. A frame voicing decision is made for each frame of the incoming speech signal according to values calculated from the extracted parameters. The predetermined set of parameters further includes a partial residual frame full band energy, and a set of spectral parameters called Line Spectral Frequencies (LSF). A signal-to-noise value is estimated and tracked to adaptively set threshold values, thereby improving performance under various noise conditions.
