    • 1. Granted invention patent
    • Method of and apparatus for animation, driven by an audio signal, of a synthesized model of a human face
    • US06665643B1
    • 2003-12-16
    • US09407027
    • 1999-09-28
    • Claudio Lande; Mauro Quaglia
    • Claudio Lande; Mauro Quaglia
    • G10L13/04
    • G06T9/001; G10L2021/105
    • A method and an apparatus for the animation, driven by an audio signal, of a synthesised human face model are described, that allow the animation of any model complying with the ISO/IEC standard 14496 (“MPEG-4 standard”). The concerned phonemes are derived from the audio signal, and the corresponding visemes are identified within a set comprising both visemes defined by the standard and visemes typical of the language. Visemes are split into macroparameters that define shape and positions of the mouth and jaw of the model and that are associated with values indicating a difference from a neutral position. Such macroparameters are then transformed into face animation parameters complying with the standard, the values of which define the deformation to be applied to the model in order to achieve animation.
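The pipeline in the abstract above (phoneme, then viseme, then macroparameters, then face animation parameters) can be sketched roughly as follows. This is only an illustration: every table entry, parameter name, and weight below is a hypothetical placeholder, not a value from the patent or from the MPEG-4 standard.

```python
# Hypothetical lookup tables; the names and numbers are illustrative only.
PHONEME_TO_VISEME = {"p": "closed_lips", "b": "closed_lips", "a": "open_vowel"}

# Each viseme is described by macroparameters, given as offsets from a
# neutral face position (0.0 means "no change from neutral").
VISEME_MACROPARAMETERS = {
    "closed_lips": {"mouth_opening": -0.1, "jaw_opening": 0.0},
    "open_vowel": {"mouth_opening": 0.8, "jaw_opening": 0.6},
}

# Each macroparameter contributes to one or more low-level animation
# parameters through a (hypothetical) linear weight.
MACRO_TO_FAPS = {
    "mouth_opening": {"lower_lip": 1.0, "upper_lip": -0.5},
    "jaw_opening": {"jaw": 1.0},
}


def phoneme_to_faps(phoneme):
    """Map one phoneme to a dict of animation-parameter offsets."""
    viseme = PHONEME_TO_VISEME[phoneme]
    faps = {}
    for macro, value in VISEME_MACROPARAMETERS[viseme].items():
        for fap, weight in MACRO_TO_FAPS[macro].items():
            # Accumulate contributions from every macroparameter.
            faps[fap] = faps.get(fap, 0.0) + weight * value
    return faps
```

For example, `phoneme_to_faps("p")` and `phoneme_to_faps("b")` return the same offsets because both phonemes share one viseme in this toy table.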
    • 2. Granted invention patent
    • Synthesis of speech from pitch prototype waveforms by time-synchronous waveform interpolation
    • US06754630B2
    • 2004-06-22
    • US09191631
    • 1998-11-13
    • Amitava Das; Eddie L. T. Choy
    • Amitava Das; Eddie L. T. Choy
    • G10L13/04
    • G10L19/0204; G10L25/27
    • In a method of synthesizing voiced speech from pitch prototype waveforms by time-synchronous waveform interpolation (TSWI), one or more pitch prototypes are extracted from a speech signal or a residue signal. The extraction process is performed in such a way that the prototype has minimum energy at the boundary. Each prototype is circularly shifted so as to be time-synchronous with the original signal. A linear phase shift is applied to each extracted prototype relative to the previously extracted prototype so as to maximize the cross-correlation between successive extracted prototypes. A two-dimensional prototype-evolving surface is constructed by upsampling the prototypes to every sample point. The two-dimensional prototype-evolving surface is re-sampled to generate a one-dimensional, synthesized signal frame with sample points defined by piecewise continuous cubic phase contour functions computed from the pitch lags and the phase shifts added to the extracted prototypes. A pre-selection filter may be applied to determine whether to abandon the TSWI technique in favor of another algorithm for the current frame. A post-selection performance measure may be obtained and compared with a predetermined threshold to determine whether the TSWI algorithm is performing adequately.
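One step of the abstract above, aligning each prototype with the previous one by maximizing their cross-correlation, can be illustrated with a brute-force search over circular shifts (a circular shift in time corresponds to a linear phase shift in frequency). This is a minimal sketch that assumes both prototypes have the same length; the function names are hypothetical and the patent's actual procedure may differ.

```python
def circular_shift(x, shift):
    """Rotate the list x to the right by `shift` samples."""
    n = len(x)
    return [x[(i - shift) % n] for i in range(n)]


def circular_cross_correlation(a, b, shift):
    """Correlation of a with b after rotating b right by `shift` samples."""
    n = len(a)
    return sum(a[i] * b[(i - shift) % n] for i in range(n))


def best_circular_shift(previous, current):
    """Return the right-rotation of `current` that best matches `previous`."""
    if len(previous) != len(current):
        raise ValueError("this sketch assumes equal-length prototypes")
    n = len(current)
    # Exhaustive search: try every possible rotation and keep the best one.
    return max(
        range(n),
        key=lambda s: circular_cross_correlation(previous, current, s),
    )
```

Applying `circular_shift(current, best_circular_shift(previous, current))` yields the aligned prototype. The exhaustive search costs O(n²) per pair; an FFT-based correlation would be the usual faster alternative.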
    • 3. Granted invention patent
    • Multi-tasking speech synthesizer
    • US06240390B1
    • 2001-05-29
    • US09137958
    • 1998-08-21
    • Chaur-Wen Jih
    • Chaur-Wen Jih
    • G10L13/04
    • G10L13/047
    • A speech synthesizer and a method of synthesizing speech are provided. The speech synthesizer includes a memory unit having an interrupt vector section, a voice list section, a control program section, and a speech data section; a voice list pointer for pointing to the address in the voice list section of the memory unit where data are to be retrieved; a start address register whose content represents the starting address of a specific segment of waveform data stored in the speech data section of the memory unit; a program counter whose output is used to gain access to specific addresses in the control program section of the memory unit; a synthesizer, coupled to the memory unit, for synthesizing the retrieved speech data from the memory unit into voice data; and an interrupt controller coupled to the synthesizer, which is capable of actuating the execution of a synthesis interrupt service routine stored in the memory unit in response to an interrupt signal generated by the synthesizer. The foregoing architecture for the speech synthesizer allows the speech synthesizer to be capable of driving external devices in a multi-tasking manner while nonetheless allowing the software complexity to be simple to implement. Moreover, the architecture and method of the speech synthesizer allows the voice concatenation to be easy to implement either through hardware or through software.
    • 5. Granted invention patent
    • Generating synthesized voice and instrumental sound
    • US06513007B1
    • 2003-01-28
    • US09619955
    • 2000-07-20
    • Akio Takahashi
    • Akio Takahashi
    • G10L13/04
    • G10H1/125; G10H7/10
    • There is provided a synthesized sound generating apparatus and method which can achieve responsive and high-quality speech synthesis based on a real-time convolution operation. Coefficients are generated by using dynamic cutting to extract characteristic information from a first signal. A convolution operation is performed on a second signal using the generated coefficients to generate a synthesized signal. As the convolution operation, an interpolation process is performed on the coefficients to prevent a rapid change in level of the generated synthesized signal upon switching of the coefficients.
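The idea of interpolating coefficients when they are switched, so that the output level does not jump, can be sketched as a time-varying FIR convolution with a linear ramp between the old and new coefficient sets. This is a minimal illustration only: the linear ramp, the function name, and the `switch_at` and `ramp` parameters are assumptions, not details taken from the patent.

```python
def convolve_with_coefficient_ramp(signal, old_coeffs, new_coeffs, switch_at, ramp):
    """FIR-filter `signal`, blending from old to new coefficients.

    The blend starts at sample index `switch_at` and completes `ramp`
    samples later; before that only `old_coeffs` are used, afterwards
    only `new_coeffs`.
    """
    if len(old_coeffs) != len(new_coeffs):
        raise ValueError("coefficient sets must have the same length")
    if ramp <= 0:
        raise ValueError("ramp must be a positive number of samples")
    out = []
    for n in range(len(signal)):
        # Blend weight: 0 before the switch, rising linearly to 1.
        t = min(max((n - switch_at) / ramp, 0.0), 1.0)
        coeffs = [(1.0 - t) * o + t * c for o, c in zip(old_coeffs, new_coeffs)]
        # Direct-form convolution; samples before index 0 count as zero.
        acc = sum(coeffs[k] * signal[n - k] for k in range(len(coeffs)) if n - k >= 0)
        out.append(acc)
    return out
```

With a constant input and single-tap coefficients going from 1.0 to 0.0, the output fades out over `ramp` samples instead of dropping abruptly at `switch_at`.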
    • 6. Granted invention patent
    • Synthesis of time-domain signals using non-overlapping transforms
    • US06311158B1
    • 2001-10-30
    • US09268878
    • 1999-03-16
    • Jean Laroche
    • Jean Laroche
    • G10L13/04
    • G10L13/02
    • Techniques for synthesizing a time-domain signal. The time-domain signal is partitioned into a number of time-domain frames and a waveform is generated for each time-domain frame. Each waveform includes one or more sinusoids. The waveform is generated by selecting a sinusoid for synthesis and computing a set of parameter values (e.g. the start and end amplitude, frequency, and phase values) for the selected sinusoid. A template is determined for the selected sinusoid based on the computed parameter values and a selected window function. The frequency-domain template is such that the amplitude of the selected sinusoid in the time domain matches, at a time-domain frame boundary, the amplitude of a corresponding sinusoid in an adjacent time-domain frame. The template is added to a frequency-domain frame. The process is repeated for each sinusoid in the waveform. After all sinusoids have been processed, the frequency-domain frame is transformed to a time-domain frame. The time-domain frame is re-normalized with a re-normalization function that is generated based on the selected window function. A predetermined number of samples from each end of the time-domain frame can be discarded. The waveform is defined by the non-discarded samples in the time-domain frame. The waveforms from the time-domain frames are concatenated to generate the time-domain signal.
    • 8. Granted invention patent
    • Method and apparatus for rapid acoustic unit selection from a large speech corpus
    • US06697780B1
    • 2004-02-24
    • US09557146
    • 2000-04-25
    • Mark Charles Beutnagel; Mehryar Mohri; Michael Dennis Riley
    • Mark Charles Beutnagel; Mehryar Mohri; Michael Dennis Riley
    • G10L13/04
    • G10L13/07
    • A speech synthesis system can select recorded speech fragments, or acoustic units, from a very large database of acoustic units to produce artificial speech. The selected acoustic units are chosen to minimize a combination of target and concatenation costs for a given sentence. However, as concatenation costs, which are measures of the mismatch between sequential pairs of acoustic units, are expensive to compute, processing can be greatly reduced by pre-computing and caching the concatenation costs. Unfortunately, the number of possible sequential pairs of acoustic units makes such caching prohibitive. However, statistical experiments reveal that while about 85% of the acoustic units are typically used in common speech, less than 1% of the possible sequential pairs of acoustic units occur in practice. A method for constructing an efficient concatenation cost database is provided by synthesizing a large body of speech, identifying the acoustic unit sequential pairs generated and their respective concatenation costs, and storing those concatenation costs likely to occur. By constructing a concatenation cost database in this fashion, the processing power required at run-time is greatly reduced with negligible effect on speech quality.
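The caching strategy in the abstract above (precompute concatenation costs only for unit pairs that actually occur in a large body of synthesized speech) can be sketched as a small lookup table. This is an illustration only: the class name, the `cost_fn` callback, and the fall-back to direct computation on a cache miss are all assumptions; the abstract does not say how unseen pairs are handled.

```python
class ConcatenationCostCache:
    """Table of precomputed costs for acoustic-unit pairs seen in practice."""

    def __init__(self, cost_fn):
        self._cost_fn = cost_fn  # expensive function: (left, right) -> cost
        self._table = {}
        self.misses = 0  # number of lookups that were not in the table

    def precompute(self, unit_sequences):
        """Store the cost of every adjacent unit pair in the given sequences."""
        for seq in unit_sequences:
            for left, right in zip(seq, seq[1:]):
                if (left, right) not in self._table:
                    self._table[(left, right)] = self._cost_fn(left, right)

    def cost(self, left, right):
        """Return the cached cost, computing it directly on a miss (assumed)."""
        try:
            return self._table[(left, right)]
        except KeyError:
            self.misses += 1
            return self._cost_fn(left, right)
```

A typical use would call `precompute` once offline on the unit sequences chosen while synthesizing a training corpus, then call `cost` inside the run-time unit-selection search. Tracking `misses` gives a quick check on how well the cached pairs cover real input.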
    • 10. Granted invention patent
    • Signal synthesis by decoding subband scale factors from one audio signal and subband samples from different one
    • US06477496B1
    • 2002-11-05
    • US08772591
    • 1996-12-20
    • Eliot M. Case
    • Eliot M. Case
    • G10L13/04
    • G10L19/0208
    • A method, system and product are provided for synthesizing sound using encoded audio signals having a plurality of frequency subbands, each subband having a scale factor and sample data associated therewith. The method includes selecting a spectral envelope, and selecting a plurality of frequency subbands, each subband having sample data associated therewith. The method also includes generating a synthetic encoded audio signal having a plurality of frequency subbands, the subbands having the selected spectral envelope and the selected sample data. The system includes control logic for performing the method. The product includes a storage medium having computer readable programmed instructions for performing the method.
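The combination described above, taking the spectral envelope (scale factors) from one encoded signal and the subband sample data from another, can be sketched on a toy frame representation. This is only an illustration: the list-of-dicts frame layout and the function names are hypothetical, not the encoded format the patent operates on.

```python
def synthesize_frame(envelope_frame, sample_frame):
    """Build a frame with scale factors from one source and samples from another.

    Each frame is a list of subbands; each subband is a dict with a
    "scale_factor" (float) and "samples" (list of floats).
    """
    if len(envelope_frame) != len(sample_frame):
        raise ValueError("frames must have the same number of subbands")
    return [
        {"scale_factor": env["scale_factor"], "samples": list(src["samples"])}
        for env, src in zip(envelope_frame, sample_frame)
    ]


def decode_subband(subband):
    """Rescale the stored samples by the subband's scale factor."""
    return [subband["scale_factor"] * s for s in subband["samples"]]
```

Because the merge happens on the encoded fields themselves, neither input has to be decoded to the time domain before the new frame is assembled.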