    • 1. Granted invention patent
    • Speech synthesis method
    • US06760703B2
    • 2004-07-06
    • US10265458
    • 2002-10-07
    • Takehiko Kagoshima; Masami Akamine
    • Takehiko Kagoshima; Masami Akamine
    • G10L13/02
    • G10L13/07; G10L25/90
    • A speech synthesis method that generates a speech pitch wave from a reference speech signal by subjecting the reference speech signal to one of Fourier transform and Fourier series expansion to produce a discrete spectrum, that interpolates the discrete spectrum to generate a consecutive spectrum, and that subjects the consecutive spectrum to inverse Fourier transform. A linear prediction coefficient is generated by subjecting the reference speech signal to a linear prediction analysis. The speech pitch wave is subjected to inverse-filtering based on the linear prediction coefficient to produce a residual pitch wave. Information regarding the residual pitch wave is stored as information of a speech synthesis unit in a voice period. A speech is then synthesized using the information of the speech synthesis unit.
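The linear prediction and inverse-filtering steps named in this abstract can be illustrated with a minimal sketch. It assumes the autocorrelation method with the Levinson-Durbin recursion, which the abstract does not specify. The function names `autocorrelation`, `lpc_coefficients`, and `inverse_filter` are hypothetical and are not taken from the patent.

```python
def autocorrelation(signal, max_lag):
    """Return r[0..max_lag] where r[k] = sum of signal[n] * signal[n + k]."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc_coefficients(signal, order):
    """Levinson-Durbin recursion (assumed method, not stated in the abstract).

    Returns a[1..order] such that signal[n] is predicted by
    sum(a[k] * signal[n - k]).
    """
    r = autocorrelation(signal, order)
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err == 0.0:
            break  # silent or perfectly predicted input: stop early
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= (1.0 - k * k)
    return a[1:]


def inverse_filter(wave, coeffs):
    """Residual e[n] = wave[n] - sum(coeffs[k-1] * wave[n - k])."""
    residual = []
    for n, sample in enumerate(wave):
        prediction = sum(c * wave[n - k]
                         for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(sample - prediction)
    return residual
```

In this sketch the residual, rather than the pitch wave itself, would be what gets stored as the synthesis unit; how the patent encodes that information is not described in the abstract.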
    • 2. Granted invention patent
    • Speech synthesis method
    • US06553343B1
    • 2003-04-22
    • US09984254
    • 2001-10-29
    • Takehiko Kagoshima; Masami Akamine
    • Takehiko Kagoshima; Masami Akamine
    • G01L13/02
    • G10L13/07; G10L25/90
    • A speech synthesis method subjects a reference speech signal to windowing to extract an aperiodic speech pitch wave from the reference speech signal. A linear prediction coefficient is generated by subjecting the reference speech signal to a linear prediction analysis. The aperiodic speech pitch wave is subjected to inverse-filtering based on the linear prediction coefficient to produce a residual pitch wave. Information regarding the residual pitch wave is stored as information of a speech synthesis unit and a voiced period in the storage. The speech is then synthesized using the information of the speech synthesis unit.
    • 3. Granted invention patent
    • Speech synthesis method and speech synthesizer
    • US07251601B2
    • 2007-07-31
    • US10101689
    • 2002-03-21
    • Takehiko Kagoshima; Masami Akamine
    • Takehiko Kagoshima; Masami Akamine
    • G10L13/04
    • G10L13/04; G10L25/27
    • A speech synthesis method comprises selecting predetermined formant parameters from formant parameters according to a pitch pattern, phoneme duration, and phoneme symbol string; generating a plurality of sine waves based on the formant frequency and formant phase of the selected formant parameters; multiplying the sine waves by windowing functions of the selected formant parameters, respectively, to generate a plurality of formant waveforms; adding the formant waveforms to generate a plurality of pitch waveforms; and superposing the pitch waveforms according to a pitch period to generate a speech signal.
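The chain described in this abstract (windowed sine waves per formant, summed into a pitch waveform, then superposed at the pitch period) can be sketched as follows. This is only an illustration under assumptions: windows are passed in as plain lists of samples, and the names `formant_waveform`, `pitch_waveform`, and `overlap_add` are hypothetical rather than the patent's own.

```python
import math


def formant_waveform(freq_hz, phase, window, sample_rate):
    """One formant waveform: a sine wave multiplied sample-by-sample by a window."""
    return [w * math.sin(2.0 * math.pi * freq_hz * n / sample_rate + phase)
            for n, w in enumerate(window)]


def pitch_waveform(formants, sample_rate):
    """Sum the formant waveforms; `formants` is a list of (freq_hz, phase, window)."""
    length = max(len(window) for _, _, window in formants)
    out = [0.0] * length
    for freq_hz, phase, window in formants:
        for n, v in enumerate(formant_waveform(freq_hz, phase, window, sample_rate)):
            out[n] += v
    return out


def overlap_add(pitch_wave, pitch_period, count):
    """Superpose `count` copies of the pitch waveform, one every `pitch_period` samples."""
    if count <= 0:
        return []
    out = [0.0] * (pitch_period * (count - 1) + len(pitch_wave))
    for i in range(count):
        start = i * pitch_period
        for n, v in enumerate(pitch_wave):
            out[start + n] += v
    return out
```

A constant pitch period is used here for brevity; following a pitch pattern would mean varying the spacing between successive copies.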
    • 4. Granted invention patent
    • Clustered patterns for text-to-speech synthesis
    • US06529874B2
    • 2003-03-04
    • US09149036
    • 1998-09-08
    • Takehiko Kagoshima; Takaaki Nii; Shigenobu Seto; Masahiro Morita; Masami Akamine; Yoshinori Shiga
    • Takehiko Kagoshima; Takaaki Nii; Shigenobu Seto; Masahiro Morita; Masami Akamine; Yoshinori Shiga
    • G10L13/08
    • G10L13/10
    • A representative pattern memory stores a plurality of initial representative patterns as a noise pattern. A different attribute is affixed to each initial representative pattern. A pitch pattern memory stores a large number of natural pitch patterns as an accent phrase. A clustering unit classifies each natural pitch pattern to the initial representative pattern based on the attribute of the accent phrase. A transformation parameter generation unit calculates an error between a transformed representative pattern and each natural pitch pattern classified to the initial representative pattern. A representative pattern generation unit calculates an evaluation function of the sum of the error between the transformed representative pattern and each natural pitch pattern classified to the initial representative pattern, and updates each initial representative pattern. The representative pattern memory stores each updated representative pattern as a clustered pattern of the attribute affixed to the corresponding initial representative pattern.
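One update step of the scheme in this abstract can be sketched under strong simplifying assumptions: the transformation is reduced to a single additive offset per natural pitch pattern, the error is a squared difference, and all patterns share the representative's length. None of this is stated in the abstract, and the names `group_by_attribute`, `fit_offset`, `total_error`, and `update_representative` are hypothetical.

```python
from collections import defaultdict


def group_by_attribute(labelled_patterns):
    """Classify (attribute, pattern) pairs into one cluster per attribute."""
    groups = defaultdict(list)
    for attribute, pattern in labelled_patterns:
        groups[attribute].append(pattern)
    return dict(groups)


def fit_offset(representative, pattern):
    """Offset b minimising the squared error of (representative + b) vs pattern."""
    return sum(p - r for p, r in zip(pattern, representative)) / len(representative)


def total_error(representative, patterns):
    """Evaluation function: summed squared error after the best offset per pattern."""
    error = 0.0
    for pattern in patterns:
        b = fit_offset(representative, pattern)
        error += sum((r + b - p) ** 2 for r, p in zip(representative, pattern))
    return error


def update_representative(representative, patterns):
    """Hold the fitted offsets fixed and re-estimate the representative pattern."""
    offsets = [fit_offset(representative, p) for p in patterns]
    count = len(patterns)
    return [sum(p[i] - b for p, b in zip(patterns, offsets)) / count
            for i in range(len(representative))]
```

Alternating `fit_offset` and `update_representative` until `total_error` stops decreasing gives one plausible training loop; the patent's actual transformation and stopping rule may differ.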
    • 5. Granted invention patent
    • Interpolating between representative frame waveforms of a prediction error signal for speech synthesis
    • US5890118A
    • 1999-03-30
    • US613093
    • 1996-03-08
    • Takehiko Kagoshima; Masami Akamine
    • Takehiko Kagoshima; Masami Akamine
    • G10L11/00; G10L13/06; G10L9/04
    • G10L13/07
    • A speech synthesis apparatus includes: a memory for storing a plurality of typical waveforms corresponding to a plurality of frames, the typical waveforms each previously obtained by extracting in units of at least one frame from a prediction error signal formed in predetermined units; a voiced speech source generator including an interpolation circuit for performing interpolation between the typical waveforms read out from the memory to obtain a plurality of interpolation signals, each having at least one of an interpolation pitch period and a signal level which changes smoothly between the corresponding frames, and a superposition circuit for superposing the interpolation signals obtained by the interpolation circuit to form a voiced speech source signal; an unvoiced speech source generator for generating an unvoiced speech source signal; and a vocal tract filter selectively driven by the voiced speech source signal output from the voiced speech source generator and the unvoiced speech source signal from the unvoiced speech source generator to generate synthetic speech. Further, interpolation positions can be determined based on the pitch period.
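The interpolation and superposition circuits in this abstract can be approximated in software with a short sketch. It assumes linear interpolation between two equal-length typical waveforms and a linearly changing pitch period within one frame; both are illustrative choices, and `interpolate_waveforms` and `voiced_source` are hypothetical names.

```python
def interpolate_waveforms(wave_a, wave_b, t):
    """Linear blend of two typical waveforms: t=0 gives wave_a, t=1 gives wave_b."""
    return [(1.0 - t) * a + t * b for a, b in zip(wave_a, wave_b)]


def voiced_source(wave_a, wave_b, period_a, period_b, frame_len):
    """Build one frame of voiced source signal by superposing interpolated waveforms.

    Each interpolated waveform is placed at a position that advances by the
    interpolated pitch period, so positions are determined by the pitch period.
    """
    out = [0.0] * frame_len
    pos = 0
    while pos < frame_len:
        t = pos / frame_len  # relative position inside the frame
        for n, v in enumerate(interpolate_waveforms(wave_a, wave_b, t)):
            if pos + n < frame_len:
                out[pos + n] += v
        period = round((1.0 - t) * period_a + t * period_b)
        pos += max(1, period)  # always advance, even for tiny periods
    return out
```

The resulting signal would then drive a vocal tract filter; that stage is omitted here.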
    • 6. Granted invention patent
    • Speech synthesis method
    • US06332121B1
    • 2001-12-18
    • US09722047
    • 2000-11-27
    • Takehiko Kagoshima; Masami Akamine
    • Takehiko Kagoshima; Masami Akamine
    • G10L13/00
    • G10L13/07; G10L25/90
    • In a synthesis unit generator, a plurality of synthesis speech segments are generated by synthesizing training speech segments labeled with phonetic contexts and input speech segments while altering the pitch/duration of the input speech segments in accordance with the pitch/duration of the training speech segments. Typical speech segments are selected from the input speech segments on the basis of a distance between the synthesis speech segments and the training speech segments, and are stored in a storage. In addition, a plurality of phonetic context clusters corresponding to the synthesis units are generated on the basis of the distance, and are stored in a storage. A synthesis speech signal is generated by reading out, from the storage, those of the synthesis units, which correspond to the phonetic context clusters including phonetic contexts of input phonemes, and connecting the selected synthesis units in a speech synthesizer.
    • 8. Granted invention patent
    • Speech synthesis method
    • US07184958B2
    • 2007-02-27
    • US10792888
    • 2004-03-05
    • Takehiko Kagoshima; Masami Akamine
    • Takehiko Kagoshima; Masami Akamine
    • G10L13/00; G10L19/04
    • G10L13/07; G10L25/90
    • A speech synthesis method subjects a reference speech signal to windowing, using a window function whose window length is double the pitch period of the reference speech signal, to extract a speech pitch wave from the reference speech signal. A linear prediction coefficient is generated by subjecting the reference speech signal to a linear prediction analysis. The speech pitch wave is subjected to inverse-filtering based on the linear prediction coefficient to produce a residual pitch wave, which is then stored as information of a speech synthesis unit in a voiced period in a storage. Speech is then synthesized using the information of the speech synthesis unit.
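The windowing step, with a window twice as long as the pitch period, can be sketched as below. The Hann shape and the centring of the window on a pitch mark are assumptions made for illustration, since the abstract fixes only the window length; `hann` and `extract_pitch_wave` are hypothetical names.

```python
import math


def hann(length):
    """Periodic Hann window (assumed shape; the abstract only fixes the length)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length) for n in range(length)]


def extract_pitch_wave(signal, pitch_mark, pitch_period):
    """Cut one pitch wave with a window of length 2 * pitch_period.

    The window is centred on `pitch_mark`; samples outside the signal are
    treated as zero.
    """
    length = 2 * pitch_period
    start = pitch_mark - pitch_period
    window = hann(length)
    wave = []
    for n in range(length):
        idx = start + n
        sample = signal[idx] if 0 <= idx < len(signal) else 0.0
        wave.append(window[n] * sample)
    return wave
```

With this window shape, copies of the window spaced one pitch period apart sum to one, so overlapping the extracted pitch waves at the original period reproduces a steady signal.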
    • 9. Granted invention patent
    • Phonemic unit dictionary based on shifted portions of source codebook vectors, for text-to-speech synthesis
    • US06202048B1
    • 2001-03-13
    • US09239966
    • 1999-01-29
    • Katsumi Tsuchiya; Takehiko Kagoshima; Masami Akamine
    • Katsumi Tsuchiya; Takehiko Kagoshima; Masami Akamine
    • G10L13/06
    • G10L19/12; G10L13/06
    • A speech synthesis apparatus synthesizes a speech signal by filtering a speech source signal through a synthesis filter. A speech source signal codebook stores a plurality of speech source signals as a code vector. A unit dictionary memory stores a plurality of synthesis units corresponding to phonemic symbols, each synthesis unit comprising an index of the code vector in the speech source codebook and a shift number for the code vector to decode the speech source signal. A unit selection section selects a synthesis unit corresponding to phonemic symbols to be synthesized from the unit dictionary memory. A synthesis unit decoder selects the code vector corresponding to the index in the synthesis unit from the speech source signal codebook, and shifts the code vector according to the shift number in the synthesis unit.
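The decoding step, which looks up a code vector by index and then applies the shift number, might look like the sketch below. It assumes that "shift" means taking a portion of the code vector starting at the shift offset, as the title's "shifted portions" suggests. The `length` field and the names `SynthesisUnit` and `decode_unit` are hypothetical additions for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class SynthesisUnit:
    index: int   # which code vector in the speech source codebook
    shift: int   # shift number applied to that code vector
    length: int  # assumed: number of samples to take after shifting


def decode_unit(codebook: Sequence[Sequence[float]], unit: SynthesisUnit) -> List[float]:
    """Recover a speech source signal from (index, shift) under the assumptions above."""
    vector = codebook[unit.index]
    end = unit.shift + unit.length
    if unit.shift < 0 or end > len(vector):
        raise ValueError("shifted portion falls outside the code vector")
    return list(vector[unit.shift:end])
```

Because many units can point into the same code vector at different shifts, the dictionary stores only two small integers per unit instead of a full waveform.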
    • 10. Granted invention patent
    • Text presentation apparatus, text presentation method, and computer program product
    • US08655664B2
    • 2014-02-18
    • US13207575
    • 2011-08-11
    • Kentaro Tachibana; Gou Hirabayashi; Takehiko Kagoshima
    • Kentaro Tachibana; Gou Hirabayashi; Takehiko Kagoshima
    • G10L13/00; G10L15/26; G10L15/00; G10L15/06; G10L15/16; G06F17/20; G06F17/27; G06F17/21; G10L13/08; G10L21/00; G10L25/00
    • G10L13/08
    • According to an embodiment, a text presentation apparatus presenting text for a speaker to read aloud for voice recording includes: a text storing unit for storing first text; a presenting unit for presenting the first text; a determination unit for determining whether or not the first text needs to be replaced, on the basis of a speaker's input for the first text presented; a preliminary text storing unit for storing preliminary text; a select unit configured to select, if it is determined that the first text needs to be replaced, second text to replace the first text from among the preliminary text, the selecting being performed on the basis of attribute information describing an attribute of the first text and on the basis of at least one of attribute information describing pronunciation of the first text and attribute information describing a stress type of the first text; and a control unit configured to control the presenting unit so that the presenting unit presents the second text.