Quick Patent Search - Fast Search of Global Patents, Free Patent Database for Commercial Use - IPRDB

21. Invention application

US20080107348A1 DYNAMIC QUANTIZER STRUCTURES FOR EFFICIENT COMPRESSION (Status: in force)
Publication number: US20080107348A1
Publication date: 2008-05-08
Application number: US11855778
Filing date: 2007-09-14
Applicant: Jani Nurminen, Sakari Himanen
Inventor: Jani Nurminen, Sakari Himanen
IPC classification: G06K9/46
CPC classification: H04N19/126, G10L19/032, H04N19/46
Abstract: A method and system are introduced that provide dynamic quantizer structures which are configurable during run time. A quantizer configuration and data are stored in a binary format. The dynamic quantizer data is represented as a bitstream, and the bitstream in turn is used as additional input during initialization (or re-initialization/re-configuration) of a speech coder. A configuration header fully specifies the structure and configuration of the dynamic quantizer for each quantized parameter, and the dynamic quantizer data and configurations are fully and dynamically allocated into the speech coder memory. This enables easy re-configuration of a codec associated with the quantizer structures for different scenarios. The use of dynamic quantizer structures in turn enhances compression efficiency of an input signal. The dynamic quantizer structures can also be applied to other compression applications that allow lossy compression.
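The idea of a quantizer that is fully described by a binary blob and loaded at run time can be illustrated with a minimal sketch. The byte layout below (a header with codebook size and vector dimension, followed by 32-bit float entries) and the names `pack_quantizer`, `load_quantizer`, and `quantize` are assumptions made for illustration; the abstract does not specify the actual format.

```python
import struct

HEADER = "<HH"  # hypothetical header: codebook size, vector dimension


def pack_quantizer(codebook):
    """Serialize a codebook (list of equal-length float vectors) to bytes."""
    size, dim = len(codebook), len(codebook[0])
    header = struct.pack(HEADER, size, dim)
    body = b"".join(struct.pack(f"<{dim}f", *vec) for vec in codebook)
    return header + body


def load_quantizer(blob):
    """Rebuild the codebook from bytes; the header alone drives allocation."""
    size, dim = struct.unpack_from(HEADER, blob, 0)
    offset = struct.calcsize(HEADER)
    codebook = []
    for _ in range(size):
        codebook.append(list(struct.unpack_from(f"<{dim}f", blob, offset)))
        offset += 4 * dim  # each entry is a 4-byte float
    return codebook


def quantize(codebook, vec):
    """Return the index of the nearest codebook entry (squared error)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], vec)),
    )


blob = pack_quantizer([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]])
codebook = load_quantizer(blob)
index = quantize(codebook, [0.9, 1.2])
```

Because nothing about the codebook is compiled in, loading a different blob reconfigures the quantizer without changing the code, which is the property the abstract emphasizes.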

22. Invention application

US20070011009A1 Supporting a concatenative text-to-speech synthesis (Status: under examination, published)
Publication number: US20070011009A1
Publication date: 2007-01-11
Application number: US11177250
Filing date: 2005-07-08
Applicant: Jani Nurminen, Sakari Himanen, Anssi Ramo, Janne Vainio
Inventor: Jani Nurminen, Sakari Himanen, Anssi Ramo, Janne Vainio
IPC classification: G10L13/08
CPC classification: G10L13/06
Abstract: The invention relates to a support of a concatenative TTS synthesis. In order to generate a speech database as a basis for the TTS synthesis, first, a speech processing including a segmental parametric speech encoding of speech data based on a parametric modeling of speech is performed, which results in compressed parameterized speech segments. Then, the compressed parameterized speech segments are assembled in a speech database. In order to synthesize output speech, compressed parameterized speech segments are selected from the speech database based on an available text and decompressed to regain parameterized speech segments. The parameterized speech segments are then concatenated in a parameter domain. The output speech is synthesized based on these concatenated parametric speech segments.

23. Invention application

US20050091044A1 Method and system for pitch contour quantization in audio coding (Status: under examination, published)
Publication number: US20050091044A1
Publication date: 2005-04-28
Application number: US10692291
Filing date: 2003-10-23
Applicant: Anssi Ramo, Jani Nurminen, Sakari Himanen, Ari Heikkinen
Inventor: Anssi Ramo, Jani Nurminen, Sakari Himanen, Ari Heikkinen
IPC classification: G10L11/04, G10L19/02, H03M20060101
CPC classification: G10L19/032, G10L19/09
Abstract: A method and device for improving coding efficiency in audio coding. From the pitch values of a pitch contour of an audio signal, a plurality of simplified pitch contour segments are generated to approximate the pitch contour, based on one or more pre-selected criteria. The contour segments can be linear or non-linear with each contour segment represented by a first end point and a second end point. If the contour segments are linear, then only the information regarding the end points, instead of the pitch values, are provided to a decoder for reconstructing the audio signal. The contour segment can have a fixed maximum length or a variable length, but the deviation between a contour segment and the pitch values in that segment is limited by a maximum value.

24. Invention application

US20050091041A1 Method and system for speech coding (Status: under examination, published)
Publication number: US20050091041A1
Publication date: 2005-04-28
Application number: US10692290
Filing date: 2003-10-23
Applicant: Anssi Ramo, Jani Nurminen, Sakari Himanen, Ari Heikkinen
Inventor: Anssi Ramo, Jani Nurminen, Sakari Himanen, Ari Heikkinen
IPC classification: G10L20060101, G10L11/06, G10L19/02, G10L19/04, G10L19/14, G10L21/04, H04B1/06, H04M11/00
CPC classification: G10L19/24
Abstract: A method and device for use in conjunction with an encoder for encoding an audio signal into a plurality of parameters. Based on the behavior of the parameters, such as pitch, voicing, energy and spectral amplitude information of the audio signal, the audio signal can be segmented, so that the parameter update rate can be optimized. The parameters of the segmented audio signal are recorded in a storage medium or transmitted to a decoder so as to allow the decoder to reconstruct the audio signal based on the parameters indicative of the segment audio signals. For example, based on the pitch characteristic, the pitch contour can be approximated by a plurality of contour segments. An adaptive downsampling method is used to update the parameters based on the contour segments so as to reduce the update rate. At the decoder, the parameters are updated at the original rate.

25. Granted invention patent

US06801887B1 Speech coding exploiting the power ratio of different speech signal components (Status: lapsed)
Publication number: US06801887B1
Publication date: 2004-10-05
Application number: US09666971
Filing date: 2000-09-20
Applicant: Ari Heikkinen, Mikko Tammi, Jani Nurminen
Inventor: Ari Heikkinen, Mikko Tammi, Jani Nurminen
IPC classification: G10L1914
CPC classification: G10L19/097, G10L19/24
Abstract: A method and system for waveform interpolation speech coding. The method comprises the steps of decomposing the speech signal into a slowly evolving waveform component and a rapidly evolving waveform component in the encoder and determining the power ratio of these surface components so that the power ratio can be used to determine the bit allocation when the surface components are quantized. The power ratio can also be used to modify the phases of the slowly evolving waveform component when the surface components are reconstructed in the decoder in order to improve the speech quality.
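As a rough illustration of letting a power ratio steer bit allocation, the sketch below splits a bit budget between the slowly evolving (SEW) and rapidly evolving (REW) components in proportion to their mean power. The proportional rule, the `min_bits` floor, and the function names are assumptions; the abstract says only that the power ratio is used to determine the allocation.

```python
def mean_power(samples):
    """Average squared amplitude of a component."""
    return sum(v * v for v in samples) / len(samples)


def allocate_bits(sew, rew, total_bits, min_bits=1):
    """Split total_bits between SEW and REW by their share of the power.

    Hypothetical rule: bits are proportional to power, with at least
    min_bits reserved for each component.
    """
    p_sew, p_rew = mean_power(sew), mean_power(rew)
    total = p_sew + p_rew
    share = 0.5 if total == 0 else p_sew / total  # silent frame: split evenly
    sew_bits = round(share * total_bits)
    sew_bits = max(min_bits, min(total_bits - min_bits, sew_bits))
    return sew_bits, total_bits - sew_bits


# A strongly voiced frame: most of the power sits in the SEW.
sew_bits, rew_bits = allocate_bits([3, 3, 3, 3], [1, -1, 1, -1], total_bits=20)
```

Here the SEW carries nine tenths of the power and receives 18 of the 20 bits. A real coder would likely map the ratio through a tuned table rather than a straight proportion.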
