Quick Patent Search - fast search of global patents, free commercial patent database - IPRDB

1. Granted invention patent

US07454330B1 Method and apparatus for speech encoding and decoding by sinusoidal analysis and waveform encoding with phase reproducibility (lapsed)
Publication No.: US07454330B1
Publication date: 2008-11-18
Application No.: US08736546
Filing date: 1996-10-24
Applicants: Masayuki Nishiguchi, Kazuyuki Iijima, Jun Matsumoto, Shiro Omori
Inventors: Masayuki Nishiguchi, Kazuyuki Iijima, Jun Matsumoto, Shiro Omori
IPC classification: G10L19/14
CPC classification: G10L19/0212, G10L19/02, G10L19/04, G10L19/06, G10L19/12, G10L25/27, G10L25/93
Abstract: A speech encoding method and apparatus in which an input speech signal is divided into blocks or frames that serve as encoding units and is encoded unit by unit, whereby plosive and fricative consonants can be impeccably reproduced and the generation of foreign sounds at transitions between voiced (V) and unvoiced (UV) portions is suppressed, so that clear speech free of a “stuffed” feeling may be produced. The encoding apparatus includes a first encoding unit that finds the linear predictive coding (LPC) residuals of the input speech signal and performs harmonic coding, and a second encoding unit that encodes the input speech signal by waveform coding. The first and second encoding units encode the voiced (V) and unvoiced (UV) portions of the input signal, respectively. The second encoding unit uses code excited linear prediction (CELP) encoding, which employs vector quantization with a closed-loop search for the optimum vector by an analysis-by-synthesis method. A corresponding decoding method and apparatus are also provided.

2. Granted invention patent

US5930747A Pitch extraction method and device utilizing autocorrelation of a plurality of frequency bands (lapsed)
Publication No.: US5930747A
Publication date: 1999-07-27
Application No.: US788194
Filing date: 1997-01-24
Applicants: Kazuyuki Iijima, Masayuki Nishiguchi, Jun Matsumoto, Shiro Omori
Inventors: Kazuyuki Iijima, Masayuki Nishiguchi, Jun Matsumoto, Shiro Omori
IPC classification: G10L11/04, H03H17/02, G10L9/08
CPC classification: G10L25/90, G10L25/06, G10L25/18
Abstract: A pitch extraction method and apparatus whereby the pitch of a speech signal having various characteristics can be extracted accurately. The frame-based input speech signal, band-limited by an HPF 12 and an LPF 16, is sent to autocorrelation computing units 13, 17, where autocorrelation data is found. The pitch lag is computed and normalized in the pitch intensity/pitch lag computing units 14, 18. The pitch reliability of the input speech signals, band-limited by the HPF 12 and the LPF 16, is computed in evaluation parameter calculation units. A selection unit 20 selects one of the parameters obtained from the input speech signal, band-limited by the HPF 12 and the LPF 16, using the pitch lag and the evaluation parameter.
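
As a rough illustration of the autocorrelation step described in this abstract, the following Python sketch finds a pitch-lag candidate in each band-limited frame and keeps the candidate whose normalized autocorrelation peak is strongest. The function names, the lag range, and the use of the normalized peak as the evaluation parameter are assumptions made for illustration; the abstract does not give the exact computation, and the HPF/LPF stage is assumed to have been applied already.

```python
import math
import random


def pitch_candidate(frame, min_lag, max_lag):
    """Return (lag, strength) for the lag with the highest normalized autocorrelation."""
    best_lag, best_strength = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        head = frame[:len(frame) - lag]   # samples x[n]
        tail = frame[lag:]                # samples x[n + lag]
        norm = math.sqrt(sum(h * h for h in head) * sum(t * t for t in tail))
        if norm == 0.0:
            continue  # silent segment: no usable correlation
        strength = sum(h * t for h, t in zip(head, tail)) / norm
        if strength > best_strength:
            best_lag, best_strength = lag, strength
    return best_lag, best_strength


def select_pitch(band_frames, min_lag, max_lag):
    """Pick the candidate from the band whose peak is strongest (assumed criterion)."""
    candidates = [pitch_candidate(f, min_lag, max_lag) for f in band_frames]
    return max(candidates, key=lambda c: c[1])


# Usage: one periodic band (period of 50 samples) and one noisy band.
periodic = [math.sin(2 * math.pi * n / 50) for n in range(256)]
rng = random.Random(0)
noisy = [rng.uniform(-1.0, 1.0) for _ in range(256)]
lag, strength = select_pitch([noisy, periodic], min_lag=20, max_lag=80)
print(lag, round(strength, 3))
```

Normalizing by the energy of both segments keeps the strength in the range -1 to 1, so candidates from different bands can be compared directly.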

3. Granted invention patent

US5819212A Voice encoding method and apparatus using modified discrete cosine transform (lapsed)
Publication No.: US5819212A
Publication date: 1998-10-06
Application No.: US736507
Filing date: 1996-10-24
Applicants: Jun Matsumoto, Shiro Omori, Masayuki Nishiguchi, Kazuyuki Iijima
Inventors: Jun Matsumoto, Shiro Omori, Masayuki Nishiguchi, Kazuyuki Iijima
IPC classification: G10L19/02, G10L19/04, G10L19/07, G10L9/00
CPC classification: G10L19/0212, G10L19/0208, G10L19/04, G10L19/07
Abstract: A method and apparatus for encoding an input signal, such as a broad-range speech signal, in which decoding at a number of different bit rates is enabled, assuring a high encoding bit rate and minimizing deterioration of the reproduced sound even at a low bit rate. The signal encoding method includes a band-splitting step that splits the input signal into a number of bands and a step that encodes the signal of each band in a different manner depending on the signal characteristics of the band. Specifically, a low-range signal is extracted by a low-pass filter from the input signal entering a terminal and is analyzed by a linear predictive coding (LPC) analysis quantization unit. After the LPC residuals are found as short-term prediction residuals by an LPC inverse filter, the pitch is found by a pitch analysis circuit. Pitch residuals are then found by long-term prediction using a pitch inverse filter. The pitch residuals are transformed by a modified discrete cosine transform (MDCT) circuit and vector-quantized by a vector-quantization circuit. The resulting quantization indices are transmitted along with the pitch lag and the pitch gain. The line spectral pairs (LSPs) are also sent as parameters representing the LPC coefficients.
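
The MDCT stage mentioned in this abstract can be sketched in a few lines of Python. This is a minimal direct-form (O(N^2)) transform with a sine window and 50% overlap-add, intended only to show how 2N windowed samples map to N coefficients and back. The block size, the window choice, and the test signal standing in for a pitch residual are assumptions; the abstract does not specify them, and the vector quantization step is omitted.

```python
import math


def sine_window(n_coeffs):
    """Sine window of length 2N; it satisfies the Princen-Bradley condition."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_coeffs)) for n in range(2 * n_coeffs)]


def mdct(block):
    """Map 2N windowed samples to N MDCT coefficients (direct form)."""
    n_coeffs = len(block) // 2
    return [
        sum(
            x * math.cos(math.pi / n_coeffs * (n + 0.5 + n_coeffs / 2) * (k + 0.5))
            for n, x in enumerate(block)
        )
        for k in range(n_coeffs)
    ]


def imdct(coeffs):
    """Map N coefficients back to 2N time-aliased samples (2/N scaling for windowed use)."""
    n_coeffs = len(coeffs)
    return [
        (2.0 / n_coeffs) * sum(
            c * math.cos(math.pi / n_coeffs * (n + 0.5 + n_coeffs / 2) * (k + 0.5))
            for k, c in enumerate(coeffs)
        )
        for n in range(2 * n_coeffs)
    ]


def analyze_synthesize(signal, n_coeffs):
    """Window, transform, invert, and overlap-add with a hop size of N samples."""
    window = sine_window(n_coeffs)
    output = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_coeffs + 1, n_coeffs):
        block = [signal[start + n] * window[n] for n in range(2 * n_coeffs)]
        rebuilt_block = imdct(mdct(block))
        for n in range(2 * n_coeffs):
            output[start + n] += rebuilt_block[n] * window[n]
    return output


N = 8
residual = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(4 * N)]
rebuilt = analyze_synthesize(residual, N)
# Samples covered by two overlapping blocks are recovered; the signal edges are not.
max_error = max(abs(a - b) for a, b in zip(residual[N:3 * N], rebuilt[N:3 * N]))
```

The time-domain aliasing introduced by each block cancels when neighbouring blocks are overlap-added, which is why only the doubly covered middle region is compared.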

4. Granted invention patent

US5899966A Speech decoding method and apparatus to control the reproduction speed by changing the number of transform coefficients (lapsed)
Publication No.: US5899966A
Publication date: 1999-05-04
Application No.: US736211
Filing date: 1996-10-25
Applicants: Jun Matsumoto, Masayuki Nishiguchi, Shiro Omori, Kazuyuki Iijima
Inventors: Jun Matsumoto, Masayuki Nishiguchi, Shiro Omori, Kazuyuki Iijima
IPC classification: G10L19/08, G10L11/00, G10L19/00, G10L19/02, G10L19/12, G10L21/04, H03M7/30, G10L3/02
CPC classification: G10L21/04, G10L19/0212, G10L19/12, G10L25/27
Abstract: A signal decoding method and apparatus in which the speech signal reproducing speed is controlled without changing the phoneme or the pitch. The apparatus has a data number converter that converts the number of orthogonal transform coefficients entering a transmission signal input terminal from N to M, an inverse orthogonal transform unit that inverse-transforms the M orthogonal transform coefficients obtained by the data number converter, and a linear predictive coding (LPC) synthesis filter that performs predictive synthesis based on the short-term prediction residuals obtained by the inverse orthogonal transform unit. For an input signal, short-term prediction residuals are found and orthogonally transformed to form the orthogonal transform coefficients at a rate of N coefficients per transform unit. The frequency positions of the N transform coefficients may be rearranged to M values by M/N, or N may be changed to M by oversampling. A portable radio terminal embodying the invention is also described.

5. Granted invention patent

US5848387A Perceptual speech coding using prediction residuals, having harmonic magnitude codebook for voiced and waveform codebook for unvoiced frames (lapsed)
Publication No.: US5848387A
Publication date: 1998-12-08
Application No.: US736987
Filing date: 1996-10-25
Applicants: Masayuki Nishiguchi, Kazuyuki Iijima, Jun Matsumoto, Shiro Omori
Inventors: Masayuki Nishiguchi, Kazuyuki Iijima, Jun Matsumoto, Shiro Omori
IPC classification: G10L19/08, G10L19/00, G10L19/04, G10L19/06, H03M7/30, G10L9/14
CPC classification: G10L19/06
Abstract: A speech encoding method and apparatus for encoding an input speech signal on a block-by-block or frame-by-frame basis, in which short-term prediction residuals are found and sinusoidal analytic encoding parameters are then produced from those residuals. Voiced blocks or frames are encoded by perceptually weighted vector quantization of their sinusoidal frequency or analytic harmonic magnitudes; for unvoiced blocks or frames, the time waveforms are encoded.

6. Granted invention patent

US5752222A Speech decoding method and apparatus (lapsed)
Publication No.: US5752222A
Publication date: 1998-05-12
Application No.: US736342
Filing date: 1996-10-23
Applicants: Masayuki Nishiguchi, Kazuyuki Iijima, Jun Matsumoto, Shiro Omori
Inventors: Masayuki Nishiguchi, Kazuyuki Iijima, Jun Matsumoto, Shiro Omori
IPC classification: G10L19/00, G10L19/14, H03M7/30, G10L3/02, G10L9/00
CPC classification: G10L19/26
Abstract: A speech decoding method and apparatus for decoding encoded speech signals and then post-filtering the decoded signals. The filter coefficient of a spectral shaping filter in the post-filter, which is fed with the encoded and subsequently decoded speech signal, is updated every sub-frame period, while the gain of a gain adjustment circuit that corrects gain changes caused by the spectral shaping is updated every frame period, which is eight times as long as the sub-frame period. The filter coefficient thus changes smoothly and with a higher follow-up speed, while level changes otherwise caused by frequent gain switching are suppressed. The result is improved characteristics of the post-filter used for spectral shaping of the decoded signal supplied from the signal decoder, and more effective post-filter processing.

7. Granted invention patent

US06023671A Voiced/unvoiced decision using a plurality of sigmoid-transformed parameters for speech coding (lapsed)
Publication No.: US06023671A
Publication date: 2000-02-08
Application No.: US833970
Filing date: 1997-04-11
Applicants: Kazuyuki Iijima, Masayuki Nishiguchi, Jun Matsumoto, Shiro Omori
Inventors: Kazuyuki Iijima, Masayuki Nishiguchi, Jun Matsumoto, Shiro Omori
IPC classification: G10L11/00, G10L11/02, G10L11/06, G10L15/02, G10L19/00, H03M7/30, G10L9/00
CPC classification: G10L25/93
Abstract: A method and apparatus for judging whether an input speech signal is voiced or unvoiced. The input parameters for the voiced/unvoiced (V/UV) decision are judged comprehensively in order to enable a high-precision V/UV decision by a simplified algorithm. Parameters for the V/UV decision include the frame-averaged energy of the input speech signal (lev), the normalized autocorrelation peak value (r0r), the spectral similarity degree (pos), the number of zero crossings (nZero), and the pitch lag (pch). Denoting each of these parameters by x, function calculation circuits convert them using a sigmoid function g(x) = A / (1 + exp(-(x - b) / a)), where A, a, and b are constants that differ for each input parameter. Using the parameters converted by this sigmoid function g(x), the voiced/unvoiced decision is made by a V/UV decision circuit.
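
The sigmoid conversion in this abstract is simple enough to sketch directly. The Python below implements g(x) = A / (1 + exp(-(x - b) / a)) as stated, then combines the converted parameters into a single score. Everything beyond the formula is an assumption for illustration: the constants in `CONSTANTS`, the product rule in `is_voiced`, the 0.5 threshold, and the sample frames are hypothetical, and the pitch lag (pch) is left out for brevity.

```python
import math

# Hypothetical (A, a, b) constants per parameter; the abstract gives no values.
# A negative `a` flips the slope so that a high zero-crossing count lowers the score.
CONSTANTS = {
    "lev": (1.0, 50.0, 300.0),
    "r0r": (1.0, 0.05, 0.3),
    "pos": (1.0, 0.1, 0.5),
    "nZero": (1.0, -10.0, 100.0),
}


def sigmoid(x, A, a, b):
    """g(x) = A / (1 + exp(-(x - b) / a)), as given in the abstract."""
    return A / (1.0 + math.exp(-(x - b) / a))


def is_voiced(params, threshold=0.5):
    """Combine the converted parameters by a product (an assumed decision rule)."""
    score = 1.0
    for name, value in params.items():
        score *= sigmoid(value, *CONSTANTS[name])
    return score > threshold


# Made-up frames: a loud, strongly periodic one and a quiet, noisy one.
voiced_frame = {"lev": 1000.0, "r0r": 0.8, "pos": 0.9, "nZero": 30.0}
unvoiced_frame = {"lev": 200.0, "r0r": 0.1, "pos": 0.2, "nZero": 150.0}
print(is_voiced(voiced_frame), is_voiced(unvoiced_frame))
```

Mapping every parameter onto the same 0-to-A scale is what lets parameters with very different units be combined by one simple rule.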

8. Granted invention patent

US5873059A Method and apparatus for decoding and changing the pitch of an encoded speech signal (lapsed)
Publication No.: US5873059A
Publication date: 1999-02-16
Application No.: US736989
Filing date: 1996-10-25
Applicants: Kazuyuki Iijima, Masayuki Nishiguchi, Jun Matsumoto, Shiro Omori
Inventors: Kazuyuki Iijima, Masayuki Nishiguchi, Jun Matsumoto, Shiro Omori
IPC classification: G10L11/06, G10L13/00, G10L13/02, G10L19/04, G10L21/04, H03M7/30, G10L9/00
CPC classification: G10L13/033, G10L21/01, G10L19/087
Abstract: A method and apparatus for reproducing speech signals at a controlled speed and for synthesizing speech include a dividing unit that divides the input speech into time segments and an encoding unit that discriminates whether each speech segment is voiced or unvoiced. Based on the result of the discrimination, the encoding unit finds encoded parameters by performing sinusoidal synthesis and encoding for voiced segments and, for unvoiced segments, vector quantization with a closed-loop search for the optimum vector by an analysis-by-synthesis method. A period modification unit modifies the length of time associated with each signal segment and calculates a set of modified encoded parameters. In the speech synthesizing unit, encoded speech signal data is output from the encoding unit, and pitch data and amplitude data specifying the spectral envelope are sent via a data conversion unit to a waveform synthesis unit. The data conversion unit changes the number of amplitude data points of the spectral envelope without changing the shape of the envelope, so that the pitch of the signal may be varied without changing its phoneme. The waveform synthesis unit synthesizes the speech waveform based on the converted spectral envelope data and pitch data.
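
One way to picture "changing the number of amplitude data points without changing the shape of the spectral envelope" is to resample the envelope at a different number of evenly spaced points. The Python sketch below does this with linear interpolation, which is a stand-in assumption; the abstract does not say which interpolation the data conversion unit uses, and `resample_envelope` is a hypothetical name.

```python
def resample_envelope(amplitudes, new_count):
    """Resample an envelope to `new_count` evenly spaced points by linear interpolation."""
    if new_count < 1 or not amplitudes:
        raise ValueError("need at least one input and one output point")
    old_count = len(amplitudes)
    if new_count == 1 or old_count == 1:
        return [float(amplitudes[0])] * new_count
    resampled = []
    for k in range(new_count):
        # Position of output point k on the input grid, endpoints aligned.
        pos = k * (old_count - 1) / (new_count - 1)
        i = min(int(pos), old_count - 2)
        frac = pos - i
        resampled.append((1.0 - frac) * amplitudes[i] + frac * amplitudes[i + 1])
    return resampled


# Usage: three harmonic amplitudes stretched to five and five reduced to three.
stretched = resample_envelope([1.0, 2.0, 3.0], 5)
reduced = resample_envelope([1.0, 2.0, 3.0, 4.0, 5.0], 3)
```

Because the endpoints stay fixed and the intermediate values lie on the original curve, the envelope keeps its overall shape while the number of harmonics it describes changes.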

9. Granted invention patent

US5828996A Apparatus and method for encoding/decoding a speech signal using adaptively changing codebook vectors (lapsed)
Publication No.: US5828996A
Publication date: 1998-10-27
Application No.: US736988
Filing date: 1996-10-25
Applicants: Kazuyuki Iijima, Masayuki Nishiguchi, Jun Matsumoto, Shiro Omori
Inventors: Kazuyuki Iijima, Masayuki Nishiguchi, Jun Matsumoto, Shiro Omori
IPC classification: G10L19/08, G10L19/04, G10L19/12, H03M7/30, G10L7/02
CPC classification: G10L19/04, G10L19/12
Abstract: An encoding apparatus in which an input speech signal is divided into blocks and encoded in units of blocks. The encoding apparatus includes an encoding unit for performing CELP encoding, which has a noise codebook memory containing codebook vectors generated by clipping Gaussian noise and codebook vectors obtained by learning, using the code vectors generated by clipping the Gaussian noise as initial values. The encoding apparatus enables optimum encoding for a variety of speech configurations.
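
The idea of a noise codebook built from clipped Gaussian noise can be sketched as follows. The Python below assumes center clipping (samples with magnitude below a threshold are set to zero), which is only one possible reading of "clipping"; the abstract does not specify the clipping rule, the codebook size, or the vector dimension. The search shown is a plain gain-shape match against a target vector, without the synthesis filter or perceptual weighting a real CELP coder would use, and the learning step is not shown.

```python
import random


def clipped_gaussian_codebook(size, dim, threshold, seed=0):
    """Build `size` vectors of Gaussian noise with small samples zeroed (center clipping)."""
    rng = random.Random(seed)
    return [
        [g if abs(g) >= threshold else 0.0
         for g in (rng.gauss(0.0, 1.0) for _ in range(dim))]
        for _ in range(size)
    ]


def search_codebook(codebook, target):
    """Return (index, gain) minimizing the squared error between gain * vector and target."""
    best_index, best_gain, best_score = -1, 0.0, -1.0
    for index, vector in enumerate(codebook):
        energy = sum(v * v for v in vector)
        if energy == 0.0:
            continue  # an all-zero vector cannot match anything
        corr = sum(t * v for t, v in zip(target, vector))
        score = corr * corr / energy  # error reduction achieved with the optimal gain
        if score > best_score:
            best_index, best_gain, best_score = index, corr / energy, score
    return best_index, best_gain


# Usage: the target is a scaled copy of entry 3, so the search should recover it.
codebook = clipped_gaussian_codebook(size=16, dim=16, threshold=0.5)
target = [2.0 * v for v in codebook[3]]
index, gain = search_codebook(codebook, target)
```

Center clipping makes the vectors sparse, which is one plausible motivation for clipping; the abstract itself only states that the clipped vectors also serve as initial values for learning.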

10. Granted invention patent

US6012023A Pitch detection method and apparatus uses voiced/unvoiced decision in a frame other than the current frame of a speech signal (lapsed)
Publication No.: US6012023A
Publication date: 2000-01-04
Application No.: US927823
Filing date: 1997-09-11
Applicants: Kazuyuki Iijima, Masayuki Nishiguchi, Jun Matsumoto
Inventors: Kazuyuki Iijima, Masayuki Nishiguchi, Jun Matsumoto
IPC classification: G10L11/04, G10L11/06, G10L19/02, G10L19/04, H03M7/30, G10L9/00
CPC classification: G10L25/90, G10L19/0212, G10L19/09
Abstract: To realize high-precision pitch detection even for speech signals in which the half pitch or double pitch exhibits stronger autocorrelation than the pitch to be detected, an input speech signal is judged to be voiced or unvoiced, and its voiced and unvoiced portions are encoded by a sinusoidal analytic encoding unit 114 and a code excitation encoding unit 120, respectively, to produce respective encoded outputs. The sinusoidal analytic encoding unit 114 performs a pitch search on the encoded outputs to find the pitch information from the input speech signal and sets high-reliability pitch information based on the detected pitch information. The result of pitch detection is determined using the high-reliability pitch information and the voiced/unvoiced decisions for frames other than the current frame.
