Quick Patent Search: fast search of global patents, free patent database for commercial use (IPRDB)

1. Granted invention patent

US06665643B1 Method of and apparatus for animation, driven by an audio signal, of a synthesized model of a human face (in force)
Publication No.: US06665643B1
Publication date: 2003-12-16
Application No.: US09407027
Filing date: 1999-09-28
Applicant: Claudio Lande, Mauro Quaglia
Inventor: Claudio Lande, Mauro Quaglia
IPC classification: G10L13/04
CPC classification: G06T9/001, G10L2021/105
Abstract: A method and an apparatus for the animation, driven by an audio signal, of a synthesised human face model are described, that allow the animation of any model complying with the ISO/IEC standard 14496 (“MPEG-4 standard”). The concerned phonemes are derived from the audio signal, and the corresponding visemes are identified within a set comprising both visemes defined by the standard and visemes typical of the language. Visemes are split into macroparameters that define shape and positions of the mouth and jaw of the model and that are associated to values indicating a difference from a neutral position. Such macroparameters are then transformed into face animation parameters complying with the standard, the values of which define the deformation to be applied to the model in order to achieve animation.

2. Granted invention patent

US06754630B2 Synthesis of speech from pitch prototype waveforms by time-synchronous waveform interpolation (lapsed)
Publication No.: US06754630B2
Publication date: 2004-06-22
Application No.: US09191631
Filing date: 1998-11-13
Applicant: Amitava Das, Eddie L. T. Choy
Inventor: Amitava Das, Eddie L. T. Choy
IPC classification: G10L13/04
CPC classification: G10L19/0204, G10L25/27
Abstract: In a method of synthesizing voiced speech from pitch prototype waveforms by time-synchronous waveform interpolation (TSWI), one or more pitch prototypes is extracted from a speech signal or a residue signal. The extraction process is performed in such a way that the prototype has minimum energy at the boundary. Each prototype is circularly shifted so as to be time-synchronous with the original signal. A linear phase shift is applied to each extracted prototype relative to the previously extracted prototype so as to maximize the cross-correlation between successive extracted prototypes. A two-dimensional prototype-evolving surface is constructed by upsampling the prototypes to every sample point. The two-dimensional prototype-evolving surface is re-sampled to generate a one-dimensional, synthesized signal frame with sample points defined by piecewise continuous cubic phase contour functions computed from the pitch lags and the phase shifts added to the extracted prototypes. A pre-selection filter may be applied to determine whether to abandon the TSWI technique in favor of another algorithm for the current frame. A post-selection performance measure may be obtained and compared with a predetermined threshold to determine whether the TSWI algorithm is performing adequately.

3. Granted invention patent

US06240390B1 Multi-tasking speech synthesizer (lapsed)
Publication No.: US06240390B1
Publication date: 2001-05-29
Application No.: US09137958
Filing date: 1998-08-21
Applicant: Chaur-Wen Jih
Inventor: Chaur-Wen Jih
IPC classification: G10L13/04
CPC classification: G10L13/047
Abstract: A speech synthesizer and a method of synthesizing speech are provided. The speech synthesizer includes a memory unit having an interrupt vector section, a voice list section, a control program section, and a speech data section; a voice list pointer for pointing to the address in the voice list section of the memory unit where data are to be retrieved; a start address register whose content represents the starting address of a specific segment of waveform data stored in the speech data section of the memory unit; a program counter whose output is used to gain access to specific addresses in the control program section of the memory unit; a synthesizer, coupled to the memory unit, for synthesizing the retrieved speech data from the memory unit into voice data; and an interrupt controller coupled to the synthesizer, which is capable of actuating the execution of a synthesis interrupt service routine stored in the memory unit in response to an interrupt signal generated by the synthesizer. The foregoing architecture for the speech synthesizer allows the speech synthesizer to be capable of driving external devices in a multi-tasking manner while nonetheless allowing the software complexity to be simple to implement. Moreover, the architecture and method of the speech synthesizer allows the voice concatenation to be easy to implement either through hardware or through software.

4. Granted invention patent

US06546366B1 Text-to-speech converter (in force)
Publication No.: US06546366B1
Publication date: 2003-04-08
Application No.: US09258507
Filing date: 1999-02-26
Applicant: David Randall Ronca, Stephen Francis Ruhl
Inventor: David Randall Ronca, Stephen Francis Ruhl
IPC classification: G10L13/04
CPC classification: H04M3/493, G10L13/04, H04M3/4931, H04M3/5307, H04M7/12, H04M2201/60
Abstract: A text-to-speech converter includes a text-to-speech engine receiving source text and converting the source text into speech data. A read mechanism reads speech data from the text-to-speech engine and writes the speech data to a buffer. A throttle mechanism reads speech data from the buffer and conveys the speech data to a playback operation. The throttle mechanism triggers the read mechanism to read data from the text-to-speech engine and write the speech data to the buffer so that unread speech data in the buffer remains ahead of speech data read by the throttle mechanism by at least a predetermined amount.
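The read/throttle arrangement described in the abstract above can be illustrated with a small sketch. This is not the patented implementation: the class name `ThrottledBuffer`, the parameter `min_ahead` (standing in for the "predetermined amount", counted here in chunks), and the use of a plain iterable in place of a text-to-speech engine are all assumptions made for illustration.

```python
from collections import deque


class ThrottledBuffer:
    """Toy model of a read mechanism plus throttle mechanism (hypothetical names)."""

    def __init__(self, engine_chunks, min_ahead):
        self._engine = iter(engine_chunks)  # stands in for the TTS engine output
        self._buffer = deque()              # unread speech data
        self._min_ahead = min_ahead         # assumed "predetermined amount", in chunks
        self._exhausted = False

    def _read_from_engine(self):
        """Read mechanism: move one chunk from the engine into the buffer."""
        try:
            self._buffer.append(next(self._engine))
        except StopIteration:
            self._exhausted = True

    def next_chunk(self):
        """Throttle mechanism: top up the buffer, then hand one chunk to playback.

        Returns None once the engine is exhausted and the buffer is empty.
        """
        # Keep at least `min_ahead` unread chunks behind the one handed out.
        while not self._exhausted and len(self._buffer) < self._min_ahead + 1:
            self._read_from_engine()
        return self._buffer.popleft() if self._buffer else None


def play_all(buf):
    """Drain the buffer the way a playback loop would, collecting the chunks."""
    played = []
    while (chunk := buf.next_chunk()) is not None:
        played.append(chunk)
    return played
```

In this sketch the throttle pulls data on demand, so the engine is never read further ahead than playback needs; a real system would more likely run the read mechanism on its own thread.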

5. Granted invention patent

US06513007B1 Generating synthesized voice and instrumental sound (lapsed)
Publication No.: US06513007B1
Publication date: 2003-01-28
Application No.: US09619955
Filing date: 2000-07-20
Applicant: Akio Takahashi
Inventor: Akio Takahashi
IPC classification: G10L13/04
CPC classification: G10H1/125, G10H7/10
Abstract: There is provided a synthesized sound generating apparatus and method which can achieve responsive and high-quality speech synthesis based on a real-time convolution operation. Coefficients are generated by using dynamic cutting to extract characteristic information from a first signal. A convolution operation is performed on a second signal using the generated coefficients to generate a synthesized signal. As the convolution operation, an interpolation process is performed on the coefficients to prevent a rapid change in level of the generated synthesized signal upon switching of the coefficients.
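The idea of interpolating convolution coefficients when they are switched, as mentioned in the abstract above, can be sketched as follows. The abstract does not say how the interpolation is done; the linear, per-sample blend below and the function name `convolve_with_interpolated_coeffs` are assumptions chosen for illustration only.

```python
def convolve_with_interpolated_coeffs(signal, old_coeffs, new_coeffs):
    """Convolve `signal` while blending linearly from old_coeffs to new_coeffs.

    Assumption: the blend runs across the whole block, one step per sample,
    so the output level changes gradually instead of jumping at the switch.
    """
    if len(old_coeffs) != len(new_coeffs):
        raise ValueError("coefficient sets must have the same length")
    n = len(signal)
    out = []
    for i in range(n):
        # Blend factor goes from 0.0 (old set) to 1.0 (new set).
        t = i / (n - 1) if n > 1 else 1.0
        coeffs = [(1.0 - t) * a + t * b for a, b in zip(old_coeffs, new_coeffs)]
        acc = 0.0
        for k, c in enumerate(coeffs):
            if i - k >= 0:  # samples before the block start are treated as zero
                acc += c * signal[i - k]
        out.append(acc)
    return out
```

With a constant input and a switch from a gain of 1.0 to a gain of 0.0, the output fades out over the block rather than dropping abruptly.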

6. Granted invention patent

US06311158B1 Synthesis of time-domain signals using non-overlapping transforms (in force)
Publication No.: US06311158B1
Publication date: 2001-10-30
Application No.: US09268878
Filing date: 1999-03-16
Applicant: Jean Laroche
Inventor: Jean Laroche
IPC classification: G10L13/04
CPC classification: G10L13/02
Abstract: Techniques for synthesizing a time-domain signal. The time-domain signal is partitioned into a number of time-domain frames and a waveform is generated for each time-domain frame. Each waveform includes one or more sinusoids. The waveform is generated by selecting a sinusoid for synthesis and computing a set of parameter values (e.g. the start and end amplitude, frequency, and phase values) for the selected sinusoid. A template is determined for the selected sinusoid based on the computed parameter values and a selected window function. The frequency-domain template is such that the amplitude of the selected sinusoid in the time domain matches, at a time-domain frame boundary, the amplitude of a corresponding sinusoid in an adjacent time-domain frame. The template is added to a frequency-domain frame. The process is repeated for each sinusoid in the waveform. After all sinusoids have been processed, the frequency-domain frame is transformed to a time-domain frame. The time-domain frame is re-normalized with a re-normalization function that is generated based on the selected window function. A predetermined number of samples from each end of the time-domain frame can be discarded. The waveform is defined by the non-discarded samples in the time-domain frame. The waveforms from the time-domain frames are concatenated to generate the time-domain signal.

7. Granted invention patent

US06182041B2 Text-to-speech based reminder system (in force)
Publication No.: US06182041B2
Publication date: 2001-01-30
Application No.: US09170706
Filing date: 1998-10-13
Applicant: Yizhi Li, Alexander S. Ng, Trung Trinh, Ross McNamara
Inventor: Yizhi Li, Alexander S. Ng, Trung Trinh, Ross McNamara
IPC classification: G10L13/04
CPC classification: H04M3/533, G10L13/00, H04M2201/60, H04M2203/2016, H04M2203/2072, H04M2203/4536
Abstract: Text-to-speech based reminder system. One or more servers within a communication network include hardware and software which allows the system to direct the collection of data about a reminder, the translation of the reminder from text to speech, and the forwarding of the reminder to a recipient via an appropriate delivery method. The invention makes use of delivery methods such as e-mail, voicemail, or an existing telephone connection to communicate the speech reminder to a recipient. The recipient need not view a display screen to understand the content of the reminder, and needs no locally installed reminder software.

8. Granted invention patent

US06697780B1 Method and apparatus for rapid acoustic unit selection from a large speech corpus (in force)
Publication No.: US06697780B1
Publication date: 2004-02-24
Application No.: US09557146
Filing date: 2000-04-25
Applicant: Mark Charles Beutnagel, Mehryar Mohri, Michael Dennis Riley
Inventor: Mark Charles Beutnagel, Mehryar Mohri, Michael Dennis Riley
IPC classification: G10L13/04
CPC classification: G10L13/07
Abstract: A speech synthesis system can select recorded speech fragments, or acoustic units, from a very large database of acoustic units to produce artificial speech. The selected acoustic units are chosen to minimize a combination of target and concatenation costs for a given sentence. However, as concatenation costs, which are measures of the mismatch between sequential pairs of acoustic units, are expensive to compute, processing can be greatly reduced by pre-computing and caching the concatenation costs. Unfortunately, the number of possible sequential pairs of acoustic units makes such caching prohibitive. However, statistical experiments reveal that while about 85% of the acoustic units are typically used in common speech, less than 1% of the possible sequential pairs of acoustic units occur in practice. A method for constructing an efficient concatenation cost database is provided by synthesizing a large body of speech, identifying the acoustic unit sequential pairs generated and their respective concatenation costs, and storing those concatenation costs likely to occur. By constructing a concatenation cost database in this fashion, the processing power required at run-time is greatly reduced with negligible effect on speech quality.
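The caching strategy in the abstract above (store concatenation costs only for unit pairs that actually occur in a large body of synthesized speech) can be sketched roughly as below. The function names, the representation of units as plain hashable values, and the fall-back of computing the cost on a cache miss are assumptions for illustration; the abstract does not state what happens at run-time for an uncached pair.

```python
def build_cost_cache(unit_sequences, concat_cost):
    """Pre-compute costs for the adjacent unit pairs seen in synthesized output.

    unit_sequences: iterable of sequences of acoustic-unit identifiers
                    (hypothetical: any hashable values).
    concat_cost:    the expensive pairwise cost function, supplied by the caller.
    """
    cache = {}
    for seq in unit_sequences:
        for left, right in zip(seq, seq[1:]):
            if (left, right) not in cache:
                cache[(left, right)] = concat_cost(left, right)
    return cache


def lookup_cost(cache, left, right, concat_cost):
    """Return the cached cost, computing it only for pairs never seen before.

    The fall-back is an assumption of this sketch, not taken from the abstract.
    """
    cost = cache.get((left, right))
    return concat_cost(left, right) if cost is None else cost
```

Because only observed pairs are stored, the cache grows with the number of pairs that occur in practice rather than with the square of the unit inventory.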

9. Granted invention patent

US06691092B1 Voicing measure as an estimate of signal periodicity for a frequency domain interpolative speech codec system (in force)
Publication No.: US06691092B1
Publication date: 2004-02-10
Application No.: US09542390
Filing date: 2000-04-04
Applicant: Bangalore R. Udaya Bhaskar, Srinivas Nandkumar, Kumar Swaminathan, Gaguk Zakaria
Inventor: Bangalore R. Udaya Bhaskar, Srinivas Nandkumar, Kumar Swaminathan, Gaguk Zakaria
IPC classification: G10L13/04
CPC classification: G10L19/097, G10L2025/783
Abstract: A system determines a voicing measure as a measure of the degree of signal periodicity and uses the determined voicing measure to quantize the spectral magnitude of the slowly evolving waveform (SEW) and the modeling of the SEW and rapidly evolving waveform (REW) phase spectra.

10. Granted invention patent

US06477496B1 Signal synthesis by decoding subband scale factors from one audio signal and subband samples from different one (lapsed)
Publication No.: US06477496B1
Publication date: 2002-11-05
Application No.: US08772591
Filing date: 1996-12-20
Applicant: Eliot M. Case
Inventor: Eliot M. Case
IPC classification: G10L13/04
CPC classification: G10L19/0208
Abstract: A method, system and product are provided for synthesizing sound using encoded audio signals having a plurality of frequency subbands, each subband having a scale factor and sample data associated therewith. The method includes selecting a spectral envelope, and selecting a plurality of frequency subbands, each subband having sample data associated therewith. The method also includes generating a synthetic encoded audio signal having a plurality of frequency subbands, the subbands having the selected spectral envelope and the selected sample data. The system includes control logic for performing the method. The product includes a storage medium having computer readable programmed instructions for performing the method.
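The combination described in the abstract above, taking the spectral envelope (scale factors) from one encoded signal and the subband sample data from another, can be shown with a toy frame model. The frame layout used here, a list of `(scale_factor, samples)` pairs with one pair per subband, and the function names are assumptions for illustration and do not correspond to any particular codec's bitstream format.

```python
def cross_synthesize(envelope_frame, sample_frame):
    """Build a synthetic frame: scale factors from one frame, samples from another.

    Each frame is a list of (scale_factor, samples) pairs, one per subband
    (a hypothetical layout chosen for this sketch).
    """
    if len(envelope_frame) != len(sample_frame):
        raise ValueError("frames must have the same number of subbands")
    return [
        (scale, list(samples))
        for (scale, _), (_, samples) in zip(envelope_frame, sample_frame)
    ]


def decode(frame):
    """Apply each subband's scale factor to its normalized samples."""
    return [[scale * s for s in samples] for scale, samples in frame]
```

Decoding the synthetic frame then yields subband signals that follow the level contour of the first source while carrying the fine structure of the second.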
