Quick Patent Search: fast search of global patents, free patent database for commercial use - IPRDB

31. Invention application

US20090070115A1 SPEECH SYNTHESIS SYSTEM, SPEECH SYNTHESIS PROGRAM PRODUCT, AND SPEECH SYNTHESIS METHOD [In force]
Publication number: US20090070115A1
Publication date: 2009-03-12
Application number: US12192510
Filing date: 2008-08-15
Applicants: Ryuki Tachibana, Masafumi Nishimura
Inventors: Ryuki Tachibana, Masafumi Nishimura
IPC classification: G10L13/08
CPC classification: G10L13/00, G10L13/07, G10L13/10
Abstract: It is an objective of the present invention to provide waveform concatenation speech synthesis with high sound quality utilizing its advantages in the case where there is a large quantity of speech segments while providing waveform concatenation speech synthesis with accurate accents in other cases. Prosody with both high accuracy and high sound quality is achieved by performing a two-path search including a speech segment search and a prosody modification value search. In the preferred embodiment of the present invention, an accurate accent is secured by evaluating the consistency of the prosody by using a statistical model of prosody variations (the slope of fundamental frequency) for both of two paths of the speech segment selection and the modification value search. In the prosody modification value search, a prosody modification value sequence that minimizes a modified prosody cost is searched for. This allows a search for a modification value sequence that can increase the likelihood of absolute values or variations of the prosody to the statistical model as high as possible with minimum modification values.

32. Granted invention patent

US07480612B2 Word predicting method, voice recognition method, and voice recognition apparatus and program using the same methods [In force]
Publication number: US07480612B2
Publication date: 2009-01-20
Application number: US10226564
Filing date: 2002-08-22
Applicants: Shinsuke Mori, Masafumi Nishimura, Nobuyasu Itoh
Inventors: Shinsuke Mori, Masafumi Nishimura, Nobuyasu Itoh
IPC classification: G06F17/20, G10L15/00
CPC classification: G10L15/193
Abstract: A word predicting method for use with a voice recognition using a computer includes the steps of specifying a sentence structure of a history up to a word immediately before the word to be predicted, referring to a context tree stored in arboreal context tree storage section having information about possible structures of a sentence and a probability of appearance of words with respect to the structures at nodes, and predicting words based on the context tree and the specified sentence structure of the history.

33. Invention application

US20080221872A1 METHODS AND APPARATUS FOR NATURAL SPOKEN LANGUAGE SPEECH RECOGNITION [In force]
Publication number: US20080221872A1
Publication date: 2008-09-11
Application number: US12045198
Filing date: 2008-03-10
Applicants: Shinsuke Mori, Masafumi Nishimura, Nobuyasu Itoh
Inventors: Shinsuke Mori, Masafumi Nishimura, Nobuyasu Itoh
IPC classification: G06F17/27
CPC classification: G10L15/19
Abstract: A word prediction method and apparatus improves precision and accuracy. For the prediction of a sixth word “?”, a partial analysis tree having a modification relationship with the sixth word is predicted. “sara-ni sho-senkyoku no” has two partial analysis trees, “sara-ni” and “sho-senkyoku no”. It is predicted that “sara-ni” does not have a modification relationship with the sixth word, and that “sho-senkyoku no” does. Then, “donyu”, which is the sixth word from “sho-senkyoku no”, is predicted. In this example, since “sara-ni” is not useful information for the prediction of “donyu”, it is preferable that “donyu” be predicted only by “sho-senkyoku no”.

34. Invention application

US20080046247A1 System And Method For Supporting Text-To-Speech [In force]
Publication number: US20080046247A1
Publication date: 2008-02-21
Application number: US11774798
Filing date: 2007-07-09
Applicants: Gakuto Kurata, Toru Nagano, Masafumi Nishimura, Ryuki Tachibana
Inventors: Gakuto Kurata, Toru Nagano, Masafumi Nishimura, Ryuki Tachibana
IPC classification: G10L13/00
CPC classification: G10L13/04, G10L15/26
Abstract: A system for generating high-quality synthesized text-to-speech includes a learning data generating unit, a frequency data generating unit, and a setting unit. The learning data generating unit recognizes inputted speech, and then generates first learning data in which wordings of phrases are associated with readings thereof. The frequency data generating unit generates, based on the first learning data, frequency data indicating appearance frequencies of both wordings and readings of phrases. The setting unit sets the thus generated frequency data for a language processing unit in order to approximate outputted speech of text-to-speech to the inputted speech. Furthermore, the language processing unit generates, from a wording of text, a reading corresponding to the wording, on the basis of the appearance frequencies.
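Illustration (not part of the patent record): the frequency-data idea in the abstract above can be pictured with a minimal Python sketch. It assumes the learning data is already available as (wording, reading) pairs; the function names `build_frequency_data` and `choose_reading`, the sample pairs, and the "pick the most frequent reading" rule are all hypothetical simplifications, not the method claimed in the patent.

```python
from collections import Counter, defaultdict


def build_frequency_data(learning_data):
    """Count how often each reading appears for each wording.

    learning_data: iterable of (wording, reading) pairs (assumed format).
    """
    freq = defaultdict(Counter)
    for wording, reading in learning_data:
        freq[wording][reading] += 1
    return freq


def choose_reading(freq, wording):
    """Return the most frequent reading seen for a wording, or None if unseen."""
    readings = freq.get(wording)
    if not readings:
        return None
    return readings.most_common(1)[0][0]


# Hypothetical learning data, as if produced by recognizing input speech.
learning_data = [
    ("今日", "kyou"),
    ("今日", "kyou"),
    ("今日", "konnichi"),
    ("明日", "ashita"),
]
freq = build_frequency_data(learning_data)
```

With this sample data, `choose_reading(freq, "今日")` selects `"kyou"` because it was observed twice against once for `"konnichi"`. A real language processing unit would presumably combine such counts with context rather than use them in isolation.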

35. Granted invention patent

US08918396B2 Information processing apparatus, method and program for determining weight of each feature in subjective hierarchical clustering [In force]
Publication number: US08918396B2
Publication date: 2014-12-23
Application number: US13535904
Filing date: 2012-06-28
Applicants: Toru Nagano, Masafumi Nishimura, Takashima Ryoichi, Ryuki Tachibana
Inventors: Toru Nagano, Masafumi Nishimura, Takashima Ryoichi, Ryuki Tachibana
IPC classification: G06F17/30, G06N99/00, G10L17/26
CPC classification: G06N99/005, G06F17/3002, G10L17/26
Abstract: An information processing apparatus determines a weight of each physical feature for hierarchical clustering by acquiring training data of multiple pieces of content in triplets with label information indicating a pair specified by a user as having a highest degree of similarity among three contents of the triplet and executing hierarchical clustering using a feature vector of each piece of content of the training data and the weight of each feature to determine the hierarchical structure of the training data. The information processing apparatus updates the weight of each feature so that the degree of agreement between a pair combined first as being the same clusters among three contents of the triplet in a determined hierarchical structure and a pair indicated by label information corresponding to the triplet increases.
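Illustration (not part of the patent record): the "degree of agreement" described in the abstract above can be sketched in Python under strong simplifying assumptions. Here each triplet is clustered on its own, so the pair "combined first" is simply the closest pair under a weighted squared Euclidean distance; the patent instead clusters the whole training set and also specifies how the weights are updated, which this sketch does not attempt. All names and sample values are hypothetical.

```python
from itertools import combinations


def weighted_distance(x, y, weights):
    """Weighted squared Euclidean distance between two feature vectors."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))


def first_merged_pair(triplet, weights):
    """Return the pair of item names that would merge first (closest pair).

    triplet: dict mapping an item name to its feature vector.
    """
    pairs = combinations(sorted(triplet), 2)
    best = min(
        pairs,
        key=lambda p: weighted_distance(triplet[p[0]], triplet[p[1]], weights),
    )
    return frozenset(best)


def agreement(triplets, labels, weights):
    """Fraction of triplets whose first-merged pair matches the user's label."""
    hits = sum(
        first_merged_pair(t, weights) == frozenset(label)
        for t, label in zip(triplets, labels)
    )
    return hits / len(triplets)


# One hypothetical triplet; the user says "a" and "b" are the most similar.
triplet = {"a": (0.0, 0.0), "b": (0.1, 5.0), "c": (3.0, 0.2)}
labels = [("a", "b")]
```

With equal weights `(1.0, 1.0)` the closest pair is `a` and `c`, so the agreement is 0.0; with weights `(1.0, 0.0)`, which ignore the second feature, the closest pair becomes `a` and `b` and the agreement rises to 1.0. A weight-update procedure would search for weights that push this score up.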

36. Invention application

US20130006991A1 INFORMATION PROCESSING APPARATUS, METHOD AND PROGRAM FOR DETERMINING WEIGHT OF EACH FEATURE IN SUBJECTIVE HIERARCHICAL CLUSTERING [In force]
Publication number: US20130006991A1
Publication date: 2013-01-03
Application number: US13535904
Filing date: 2012-06-28
Applicants: Toru Nagano, Masafumi Nishimura, Takashima Ryoichi, Ryuki Tachibana
Inventors: Toru Nagano, Masafumi Nishimura, Takashima Ryoichi, Ryuki Tachibana
IPC classification: G06F17/30
CPC classification: G06N99/005, G06F17/3002, G10L17/26
Abstract: An information processing apparatus determines a weight of each physical feature for hierarchical clustering by acquiring training data of multiple pieces of content in triplets with label information indicating a pair specified by a user as having a highest degree of similarity among three contents of the triplet and executing hierarchical clustering using a feature vector of each piece of content of the training data and the weight of each feature to determine the hierarchical structure of the training data. The information processing apparatus updates the weight of each feature so that the degree of agreement between a pair combined first as being the same clusters among three contents of the triplet in a determined hierarchical structure and a pair indicated by label information corresponding to the triplet increases.

37. Granted invention patent

US08150687B2 Recognizing speech, and processing data [In force]
Publication number: US08150687B2
Publication date: 2012-04-03
Application number: US11000165
Filing date: 2004-11-30
Applicants: Shinsuke Mori, Nobuyasu Itoh, Masafumi Nishimura
Inventors: Shinsuke Mori, Nobuyasu Itoh, Masafumi Nishimura
IPC classification: G06F17/27, G10L15/00, G10L15/16
CPC classification: G10L15/26
Abstract: An example embodiment of the invention includes a speech recognition processing unit for specifying speech segments for speech data, recognizing a speech in each of the speech segments, and associating a character string of obtained recognition data with the speech data for each speech segment, based on information on a time of the speech, and an output control unit for displaying/outputting the text prepared by sorting the recognition data in each speech segment. Sometimes, the system further includes a text editing unit for editing the prepared text, and a speech correspondence estimation unit for associating a character string in the edited text with the speech data by using a technique of dynamic programming.
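Illustration (not part of the patent record): the abstract above mentions associating edited text with speech data by dynamic programming but does not give the procedure. One common way to do this kind of association is a minimum edit distance alignment between the recognized words and the edited words, sketched below in Python. The function name `align`, the word-level granularity, the unit costs, and the sample data are assumptions made for illustration only.

```python
def align(recognized, edited):
    """Align two word lists by minimum edit distance.

    Returns (i, j) index pairs where recognized[i] is matched with or
    substituted by edited[j]; inserted or deleted words have no pair.
    """
    n, m = len(recognized), len(edited)
    # cost[i][j] = edit distance between recognized[:i] and edited[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if recognized[i - 1] == edited[j - 1] else 1
            cost[i][j] = min(
                cost[i - 1][j - 1] + sub,  # match or substitution
                cost[i - 1][j] + 1,        # word deleted by the editor
                cost[i][j - 1] + 1,        # word inserted by the editor
            )
    # Walk back from the end to recover which positions were paired.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        sub = 0 if recognized[i - 1] == edited[j - 1] else 1
        if cost[i][j] == cost[i - 1][j - 1] + sub:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + 1:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


# Hypothetical recognition output with start times in seconds.
recognized = ["the", "whether", "is", "nice"]
times = [0.0, 0.4, 0.9, 1.1]
edited = ["the", "weather", "is", "very", "nice"]

# Carry each recognized word's start time over to the aligned edited word.
edited_times = {j: times[i] for i, j in align(recognized, edited)}
```

In this example the corrected word "weather" inherits the start time of the misrecognized "whether", while the inserted word "very" has no aligned recognized word and therefore receives no time of its own.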

38. Granted invention patent

US08024184B2 Speech recognition device, speech recognition method, computer-executable program for causing computer to execute recognition method, and storage medium [In force]
Publication number: US08024184B2
Publication date: 2011-09-20
Application number: US12476650
Filing date: 2009-06-02
Applicants: Tetsuya Takiguchi, Masafumi Nishimura
Inventors: Tetsuya Takiguchi, Masafumi Nishimura
IPC classification: G10L21/02
CPC classification: G10L15/20, G10L2021/02082
Abstract: A speech recognition device and method configured to include a computer, for recognizing speech, including: a storage location for storing a feature quantity acquired from a speech signal for each frame; storage portions for storing acoustic model data and language model data; a echo speech component for generating echo speech model data from a speech signal acquired prior to a speech signal to be processed at the current time point and using the echo speech model data to generate adapted acoustic model data; and a processing component for utilizing the feature quantity, the adapted acoustic model data, and the language model data to provide a speech recognition result of the speech signal.

39. Granted invention patent

US07921014B2 System and method for supporting text-to-speech [In force]
Publication number: US07921014B2
Publication date: 2011-04-05
Application number: US11774798
Filing date: 2007-07-09
Applicants: Gakuto Kurata, Toru Nagano, Masafumi Nishimura, Ryuki Tachibana
Inventors: Gakuto Kurata, Toru Nagano, Masafumi Nishimura, Ryuki Tachibana
IPC classification: G10L13/00
CPC classification: G10L13/04, G10L15/26
Abstract: A system for generating high-quality synthesized text-to-speech includes a learning data generating unit, a frequency data generating unit, and a setting unit. The learning data generating unit recognizes inputted speech, and then generates first learning data in which wordings of phrases are associated with readings thereof. The frequency data generating unit generates, based on the first learning data, frequency data indicating appearance frequencies of both wordings and readings of phrases. The setting unit sets the thus generated frequency data for a language processing unit in order to approximate outputted speech of text-to-speech to the inputted speech. Furthermore, the language processing unit generates, from a wording of text, a reading corresponding to the wording, on the basis of the appearance frequencies.

40. Granted invention patent

US07660717B2 Speech recognition system and program thereof [In force]
Publication number: US07660717B2
Publication date: 2010-02-09
Application number: US11971651
Filing date: 2008-01-09
Applicants: Tetsuya Takiguchi, Masafumi Nishimura
Inventors: Tetsuya Takiguchi, Masafumi Nishimura
IPC classification: G10L15/14, G10L15/00, G10L15/06
CPC classification: G10L15/30, G10L15/02, G10L15/20
Abstract: Speech recognition is performed by matching between a characteristic quantity of an inputted speech and a composite HMM obtained by synthesizing a speech HMM (hidden Markov model) and a noise HMM for each speech frame of the inputted speech by use of the composite HMM.
