Quick Patent Search: fast search of global patents, free commercial patent database - IPRDB

1. Invention patent

JP2010049196A Voice conversion apparatus and method, and speech synthesis apparatus and method (Granted)
Publication No.: JP2010049196A
Publication date: 2010-03-04
Application No.: JP2008215711
Filing date: 2008-08-25
Applicant: Toshiba Corp (株式会社東芝)
Inventors: TAMURA MASANORI, MORITA SHINKO, KAGOSHIMA TAKEHIKO
IPC classification: G10L21/04
CPC classification: G10L13/033, G10L2021/0135
Abstract: PROBLEM TO BE SOLVED: To provide a voice conversion method and apparatus capable of easily creating high-quality speech that has the voice quality of the target speech, from a small amount of target speech.
SOLUTION: A source speech spectrum parameter expressing the characteristics of voice quality is extracted from the input source speech. This parameter is converted into a first conversion spectrum parameter using a voice quality conversion rule, i.e. a rule for converting the voice quality of the source speech into that of the target speech. A target speech spectrum parameter similar to the first conversion spectrum parameter is selected from a plurality of target speech spectrum parameters held in a storage means. An aperiodic component spectrum parameter, expressing the aperiodic component of the voice quality, is created from the selected target speech spectrum parameter. A second conversion spectrum parameter is then created by mixing this aperiodic component spectrum parameter with a periodic component spectrum parameter, which expresses the periodic component of the voice quality contained in the first conversion spectrum parameter.
COPYRIGHT: (C)2010, JPO&INPIT
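
The selection and mixing steps of record 1 can be sketched in Python as follows. This is only an illustration: the squared-distance similarity measure, the band split at index `boundary`, and the function names are assumptions, since the abstract does not say how similarity is measured or how the periodic and aperiodic components are separated.

```python
def nearest_target(converted, targets):
    """Pick the stored target spectrum closest to the converted one.

    Squared Euclidean distance is an assumed similarity measure.
    """
    return min(
        targets,
        key=lambda t: sum((a - b) ** 2 for a, b in zip(converted, t)),
    )


def mix_spectra(converted, targets, boundary):
    """Build the second conversion spectrum parameter.

    Assumption: bins below `boundary` carry the periodic component (taken
    from the first conversion spectrum) and bins from `boundary` upward
    carry the aperiodic component (taken from the selected target spectrum).
    """
    selected = nearest_target(converted, targets)
    return converted[:boundary] + selected[boundary:]


first_conversion = [1.0, 2.0, 3.0, 4.0]
stored_targets = [[0.0, 0.0, 0.0, 0.0], [1.0, 2.0, 3.5, 4.5]]
second_conversion = mix_spectra(first_conversion, stored_targets, boundary=2)
```

Here the second stored spectrum is the nearer one, so the upper two bins of the result come from it while the lower two bins keep the converted values.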

2. Invention patent

JP2008249808A Speech synthesizer, speech synthesizing method and program (Granted)
Publication No.: JP2008249808A
Publication date: 2008-10-16
Application No.: JP2007087857
Filing date: 2007-03-29
Applicant: Toshiba Corp (株式会社東芝)
Inventors: MORITA SHINKO, KAGOSHIMA TAKEHIKO
IPC classification: G10L13/06
CPC classification: G10L13/07
Abstract: PROBLEM TO BE SOLVED: To provide a speech synthesizer capable of quickly and suitably selecting a speech element sequence for a synthesis unit string, under a restriction on acquiring speech element data from storage media whose data acquisition speeds differ from each other.
SOLUTION: A speech synthesis section 4 includes a high-speed storage medium 42 and a low-speed storage medium 44. A first speech element storage section 43 and a speech element attribute information storage section 46 are arranged in the high-speed storage medium 42, and a second speech element storage section 45 is arranged in the low-speed storage medium 44. For each of the speech element sequence candidates developed at a given synthesis unit, a speech element selection section 47 calculates a penalty coefficient for the evaluation value of that sequence. The coefficient is determined on the basis of the restriction regarding phoneme data acquisition and of a statistic regarding data acquisition for the speech elements included in the sequence. The section then selects a suitable candidate from the speech element sequence candidates using the evaluation value and the penalty coefficient.
COPYRIGHT: (C)2009, JPO&INPIT
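
As a rough illustration of the penalty idea in record 2, the Python sketch below prunes candidate sequences by a penalized cost. Everything specific here is assumed rather than taken from the abstract: the evaluation value is treated as a cost (lower is better), the restriction is a cap `slow_limit` on fetches from the low-speed medium over the whole string, the statistic is the number of low-speed elements used so far, and the linear penalty formula is hypothetical.

```python
def penalty(slow_count, position, total_units, slow_limit, weight=1.0):
    """Hypothetical penalty coefficient (always >= 1).

    The budget of low-speed fetches is spread evenly over the string;
    a candidate that has used more than its pro-rata share is penalized.
    """
    allowed = slow_limit * position / total_units
    excess = max(0.0, slow_count - allowed)
    return 1.0 + weight * excess


def select_candidates(candidates, position, total_units, slow_limit, beam=2):
    """Keep the `beam` candidates with the lowest penalized cost."""
    ranked = sorted(
        candidates,
        key=lambda c: c["cost"]
        * penalty(c["slow"], position, total_units, slow_limit),
    )
    return ranked[:beam]


candidates = [
    {"id": "a", "cost": 1.0, "slow": 3},  # cheapest, but heavy on slow storage
    {"id": "b", "cost": 1.2, "slow": 1},
    {"id": "c", "cost": 2.0, "slow": 0},
]
kept = select_candidates(candidates, position=2, total_units=10, slow_limit=5)
```

With these numbers candidate `a` has the lowest raw cost but exceeds its share of low-speed fetches, so it is pruned in favor of `b` and `c`.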

3. Invention patent

JP2006276528A Voice synthesizer and method thereof (Granted)
Publication No.: JP2006276528A
Publication date: 2006-10-12
Application No.: JP2005096526
Filing date: 2005-03-29
Applicant: Toshiba Corp (株式会社東芝)
Inventors: TAMURA MASANORI, HIRABAYASHI TAKESHI, KAGOSHIMA TAKEHIKO
IPC classification: G10L13/06, G10L13/08
CPC classification: G10L13/07
Abstract: PROBLEM TO BE SOLVED: To provide a high-quality voice synthesizer of the element selection type or multiple element selection type, in which the power information of large-scale voice elements is appropriately reflected and the power information of the voice elements in each voice section becomes natural and stable.
SOLUTION: A voice synthesis part 14 comprises a voice element storage part 21, a phonemic environment storage part 22, a phonological sequence/prosodic information input part 23, a multiple voice element selection part 24, a fused voice element sequence creation part 25, and a fused voice element editing/connection part 26. The fused voice element sequence creation part 25 generates a fused voice element by fusing a plurality of the selected elements. In doing so, it calculates average power information over the M selected voice elements, fuses N voice elements, and corrects the power information of the generated fused voice element so that it equals the average power information of the M voice elements.
COPYRIGHT: (C)2007, JPO&INPIT
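
The power correction in record 3 can be sketched in Python as below. This is a simplified illustration under stated assumptions: fusion is modeled as a plain sample-wise average of the first N waveforms, power is the mean squared amplitude, and the function names are hypothetical. The abstract does not describe the actual fusion procedure.

```python
import math


def mean_power(wave):
    """Mean squared amplitude of one waveform."""
    return sum(x * x for x in wave) / len(wave)


def fuse_and_correct(selected, n):
    """Fuse the first n of the M selected elements, then rescale the result
    so that its power equals the average power of all M elements."""
    target = sum(mean_power(w) for w in selected) / len(selected)
    length = min(len(w) for w in selected[:n])
    # Assumed fusion: sample-wise average of the n waveforms.
    fused = [sum(w[i] for w in selected[:n]) / n for i in range(length)]
    power = mean_power(fused)
    if power == 0.0:
        return fused  # silent result, nothing to rescale
    gain = math.sqrt(target / power)
    return [x * gain for x in fused]


elements = [
    [1.0, -1.0, 1.0, -1.0],
    [0.5, -0.5, 0.5, -0.5],
    [2.0, -2.0, 2.0, -2.0],
]
corrected = fuse_and_correct(elements, n=2)
```

In this example M = 3 and N = 2: the three elements have powers 1.0, 0.25, and 4.0, so the fused element is scaled to a power of 1.75.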

4. Invention patent

JP2004054063A Method and device for basic frequency pattern generation, speech synthesizing device, basic frequency pattern generating program, and speech synthesizing program (Granted)
Publication No.: JP2004054063A
Publication date: 2004-02-19
Application No.: JP2002213188
Filing date: 2002-07-22
Applicant: Toshiba Corp (株式会社東芝)
Inventors: HIRABAYASHI TAKESHI, KAGOSHIMA TAKEHIKO, TOKUDA RYUTARO
IPC classification: G10L13/08
Abstract: PROBLEM TO BE SOLVED: To provide a basic frequency pattern generating method capable of generating a basic frequency pattern close to that of speech uttered by a person.
SOLUTION: A storage means stores a plurality of representative patterns. Each one represents a typical basic frequency pattern for a meter control unit by statistics of the static features at respective time-series points and of the dynamic features, which express how the static features vary. A meter control unit is a unit of speech, having a time length of more than one syllable, used to control the phonologic features of the speech corresponding to a text. A representative pattern corresponding to the text is selected from the stored representative patterns, and maximum likelihood estimation of the basic frequency pattern of the speech corresponding to the text is carried out on the basis of the static-feature and dynamic-feature statistics of the selected pattern.
COPYRIGHT: (C)2004, JPO

5. Invention patent

JP2003348589A Method and device for encoding moving image, and method and device for detecting moving image motion (Granted)
Publication No.: JP2003348589A
Publication date: 2003-12-05
Application No.: JP2003148133
Filing date: 2003-05-26
Applicant: Toshiba Corp (株式会社東芝)
Inventors: KAGOSHIMA TAKEHIKO, WATANABE TOSHIAKI, KIKUCHI YOSHIHIRO, NAKAJO TAKESHI
IPC classification: H04N19/50, H04N19/423, H04N19/46, H04N19/503, H04N19/51, H04N19/527, H04N19/59, H04N19/70, H04N7/32
Abstract: PROBLEM TO BE SOLVED: To provide a moving image encoding device that, when applying global motion compensation to the entire image frame to form a predictive image, omits little of the predictive image and so does not increase the code amount needed for residual encoding.
SOLUTION: The moving image encoding device has: a global motion parameter detection circuit 21 for detecting a motion parameter representing the motion of the entire image frame; an image memory 16 for storing, as a reference image signal, the image signals within the image frame and the image signals around it; a predictive image generation circuit 23 for generating a predictive image signal by using the motion parameter to apply motion-compensated inter-frame prediction to the entire reference image signal stored in the image memory 16, and for updating the contents of the image memory 16 with the predictive image signal; a predictive error detector 11 for detecting the difference between the predictive image signal and an input moving image signal as a predictive error; and a code conversion circuit 15 for encoding the motion parameter and the predictive error.
COPYRIGHT: (C)2004, JPO

6. Invention patent

JP2003337592A Method and equipment for synthesizing voice, and program for synthesizing voice (Under examination, published)
Publication No.: JP2003337592A
Publication date: 2003-11-28
Application No.: JP2002146162
Filing date: 2002-05-21
Applicant: Toshiba Corp (株式会社東芝)
Inventors: TOKUDA RYUTARO, KAGOSHIMA TAKEHIKO, HIRABAYASHI TAKESHI
IPC classification: G10L13/08, G10L13/06
Abstract: PROBLEM TO BE SOLVED: To provide a method and equipment for synthesizing speech that can flexibly and easily synthesize various voices with different speech characteristics, such as human feeling, speech style, and personality.
SOLUTION: The method and equipment: obtain a first parameter indicating the rhythmical feature of speech having standard speech characteristics, on the basis of language information obtained by analyzing a text; obtain, on the basis of the language information and the specified speech characteristics, a second parameter for correcting the first parameter so that it indicates the rhythmical feature corresponding to the specified speech characteristics; superimpose at least the first parameter on the second parameter for every rhythm control unit (the unit of speech used to control the rhythmical feature) to generate a third parameter indicating the rhythmical feature corresponding to the specified speech characteristics; and generate synthesized speech corresponding to the specified speech characteristics on the basis of the third parameter.
COPYRIGHT: (C)2004, JPO

7. Invention patent

JP2003330482A Method, device, and program for generating fundamental frequency pattern and method, device and program for synthesizing voice (Under examination, published)
Publication No.: JP2003330482A
Publication date: 2003-11-19
Application No.: JP2002138944
Filing date: 2002-05-14
Applicant: Toshiba Corp (株式会社東芝)
Inventors: KAGOSHIMA TAKEHIKO, HIRABAYASHI TAKESHI, TOKUDA RYUTARO
IPC classification: G10L13/08, G10L11/04, G10L13/00, G10L13/04
Abstract: PROBLEM TO BE SOLVED: To provide a method and a device for generating a fundamental frequency pattern that approximates that of speech uttered by a person, and to provide a method and a system for synthesizing speech that use them.
SOLUTION: A first storage means stores a plurality of first fundamental frequency patterns defined per first rhythm control unit, a unit of speech for controlling the rhythmical features of the speech corresponding to a text. A second storage means stores a plurality of second fundamental frequency patterns defined per second rhythm control unit, a unit of speech with a shorter time length than the first rhythm control unit. On the basis of language information obtained from the text, a first fundamental frequency pattern and a second fundamental frequency pattern corresponding to the text are selected from the respective storage means, and the fundamental frequency pattern of the speech corresponding to the text is generated on the basis of at least the selected first and second fundamental frequency patterns.
COPYRIGHT: (C)2004, JPO

8. Invention patent

JP2011027852A Accent information-extracting device, accent information-extracting method and accent information-extracting program (Granted)
Publication No.: JP2011027852A
Publication date: 2011-02-10
Application No.: JP2009171473
Filing date: 2009-07-22
Applicant: Toshiba Corp (株式会社東芝)
Inventors: TACHIBANA KENTARO, HIRABAYASHI TAKESHI, KAGOSHIMA TAKEHIKO
IPC classification: G10L13/08, G10L11/04
Abstract: PROBLEM TO BE SOLVED: To provide an accent information-extracting device that accurately determines the accent type of input speech.
SOLUTION: An F0 variation pattern, i.e. the variation pattern of the basic frequency, is extracted from the input speech, and mora synchronization information, i.e. time information synchronized with each mora of the input speech, is input. Next, a mora representative value is calculated for each mora of the F0 variation pattern, and for each mora a mora variation amount is calculated as the change from its representative value to that of the following adjacent mora. Then the mora whose mora variation amount is the smallest negative value is detected. If this minimum variation amount is larger than a first threshold for determining accent type 0, the accent is determined to be type 0. If it is smaller than the first threshold, the mora variation amounts of the moras preceding the minimum are searched in succession, and the accent type is determined by detecting the foremost mora whose variation amount is smaller than a second threshold, which indicates an accent other than type 0.
COPYRIGHT: (C)2011, JPO&INPIT
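
The decision procedure of record 8 can be sketched in Python as follows. This is one possible reading of the abstract, not the patented implementation: the input is assumed to be one representative log-F0 value per mora, the threshold values `t1` and `t2` are invented for illustration, the backward search is interpreted as walking over contiguous preceding moras, and the returned accent type is taken to be the 1-based index of the detected mora.

```python
def accent_type(mora_f0, t1=-0.2, t2=-0.1):
    """Estimate the accent type from per-mora representative F0 values.

    t1: assumed first threshold (type 0 if the steepest fall is above it).
    t2: assumed second threshold for extending the fall to earlier moras.
    """
    # Variation amount of each mora: change to the following mora.
    deltas = [nxt - cur for cur, nxt in zip(mora_f0, mora_f0[1:])]
    if not deltas:
        return 0
    # Mora with the most negative variation amount.
    k = min(range(len(deltas)), key=deltas.__getitem__)
    if deltas[k] > t1:
        return 0  # no sufficiently steep fall: accent type 0
    # Walk backwards while the preceding moras are also falling below t2,
    # so that the foremost mora of the fall is reported.
    while k > 0 and deltas[k - 1] < t2:
        k -= 1
    return k + 1


fall_after_third = accent_type([5.0, 5.3, 5.3, 4.8, 4.7])
flat_contour = accent_type([5.0, 5.1, 5.1, 5.05])
early_fall = accent_type([5.4, 5.2, 4.8, 4.7])
```

In the first contour the steepest fall follows the third mora and no earlier mora is falling, so type 3 is reported. The second contour never falls steeply enough and is classified as type 0. In the third, the fall already starts after the first mora, so the search moves forward to type 1.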

9. Invention patent

JP2010230704A Speech processing device, method, and program (Granted)
Publication No.: JP2010230704A
Publication date: 2010-10-14
Application No.: JP2009074957
Filing date: 2009-03-25
Applicant: Toshiba Corp (株式会社東芝)
Inventors: MORITA SHINKO, KAGOSHIMA TAKEHIKO
IPC classification: G10L13/06
Abstract: PROBLEM TO BE SOLVED: To provide a speech processing device, method, and program that integrate elementary speech units without destroying the features of the sound source and the vocal tract filter inherent in a speech waveform.
SOLUTION: A phoneme-metric input reception section 41 receives a plurality of segments, obtained by dividing the phoneme series corresponding to the target speech into synthesis units, together with the metrical information corresponding to each segment. An acquisition section 43 acquires, for each segment, a plurality of elementary speech units related to that segment and its metrical information. A vocal tract filter component integration section 45 integrates the vocal tract filter components of the acquired elementary speech units for every segment. A sound source component integration section 46 expands and contracts the periodic sound source components of the acquired elementary speech units on the basis of the fundamental frequency or the waveform shape of the sound source component, and integrates them for every segment. An elementary unit integration section 44 integrates the acquired elementary speech units for every segment by filtering the integrated sound source component using the vocal tract filter.
COPYRIGHT: (C)2011, JPO&INPIT

10. Invention patent

JP2010190955A Voice synthesizer, method, and program (Under examination, published)
Publication No.: JP2010190955A
Publication date: 2010-09-02
Application No.: JP2009032541
Filing date: 2009-02-16
Applicant: Toshiba Corp (株式会社東芝)
Inventors: TOKUDA RYUTARO, KAGOSHIMA TAKEHIKO
IPC classification: G10L13/08, G01D7/12, G06F3/16, G10L13/00, G10L13/02
CPC classification: G10L13/00
Abstract: PROBLEM TO BE SOLVED: To provide, for a voice synthesizer that outputs by voice measured values that change as time passes, a voice output technique that lets the user grasp the measured value in a timely manner even when it changes drastically.
SOLUTION: A numerical value data inputting part 101 receives numerical value data representing measured values, and a change-of-numerical-value detecting part 102 detects the changes of those measured values over time. A text-generating part 103 generates a text representing a numerical value made up of the digit that the change-of-numerical-value detecting part 102 detected as changing among the measured values, together with the digits of lower rank than that digit. A synthetic voice-generating part 104 generates synthetic voice data that represent, by voice, the numerical value represented by the generated text. A synthetic voice-outputting part 105 outputs, via a speaker, the voice represented by the synthetic voice data.
COPYRIGHT: (C)2010, JPO&INPIT
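
The text-generation step of record 10 can be illustrated with a short Python sketch. It assumes non-negative integer readings that fit in a fixed number of digits (`width`), and the function name is hypothetical; the abstract does not say how values are represented or how an unchanged reading is handled.

```python
def changed_digits_text(previous, current, width=4):
    """Return the highest-rank digit that changed plus all lower-rank digits.

    Readings are compared as zero-padded digit strings of length `width`.
    An empty string means the reading did not change.
    """
    prev_digits = f"{previous:0{width}d}"
    cur_digits = f"{current:0{width}d}"
    for i, (p, c) in enumerate(zip(prev_digits, cur_digits)):
        if p != c:
            return cur_digits[i:]
    return ""


tens_changed = changed_digits_text(1234, 1256)
carry_changed = changed_digits_text(1299, 1300)
unchanged = changed_digits_text(1234, 1234)
```

When the reading moves from 1234 to 1256 only "56" would be passed to the synthesizer, which keeps each spoken update short when values change quickly.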
