    • 2. Granted invention patent
    • Defining atom units between phone and syllable for TTS systems
    • US07418389B2
    • 2008-08-26
    • US11033075
    • 2005-01-11
    • Min Chu; Yong Zhao
    • G10L13/06; G10L13/00
    • G10L13/08
    • A method for identifying common multiphone units to add to a unit inventory for a text-to-speech generator is disclosed. The common multiphone units are units that are larger than a phone, but smaller than a syllable. The method slices each syllable into a plurality of slices. These slices are then sorted and the frequency of each slice is determined. Those slices whose frequencies exceed a threshold are added to the unit inventory. The remaining slices are decomposed according to a predetermined set of rules to determine if they contain slices that should be added to the unit inventory.
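The abstract above describes a count-and-threshold procedure, which the following minimal Python sketch illustrates. The abstract does not give the slicing rule, the decomposition rules, or the threshold, so `slice_syllable`, `build_inventory`, the prefix/suffix slicing, and the way sub-slice counts are accumulated are all hypothetical choices made only for illustration, not the patented method.

```python
from collections import Counter

def slice_syllable(phones):
    """Hypothetical slicing rule: every prefix and suffix that has at
    least two phones and is shorter than the whole sequence."""
    n = len(phones)
    slices = []
    for k in range(2, n):
        slices.append(tuple(phones[:k]))      # leading k phones
        slices.append(tuple(phones[n - k:]))  # trailing k phones
    return slices

def build_inventory(syllables, threshold):
    # Count how often each slice occurs across all syllables.
    counts = Counter()
    for phones in syllables:
        counts.update(slice_syllable(phones))

    # Sort slices by descending frequency and keep the frequent ones.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    inventory = {unit for unit, freq in ranked if freq > threshold}

    # Assumed decomposition rule: split each remaining slice again and
    # credit its sub-slices with the parent's frequency.
    sub_counts = Counter()
    for unit, freq in ranked:
        if unit not in inventory:
            for sub in slice_syllable(unit):
                sub_counts[sub] += freq
    inventory.update(u for u, f in sub_counts.items() if f > threshold)
    return inventory

# Toy corpus: each syllable is a list of phone symbols (made-up data).
syllables = [
    ["s", "t", "r", "i", "ng"],
    ["s", "t", "r", "i", "p"],
    ["s", "t", "r", "a", "p"],
    ["t", "r", "i", "p"],
]
inventory = build_inventory(syllables, threshold=2)
```

In this toy corpus the slice `("s", "t", "r")` passes the threshold directly, while `("r", "i")` only enters the inventory through the decomposition step, which is the distinction the abstract draws between the two passes.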
    • 6. Invention patent application
    • Unnatural prosody detection in speech synthesis
    • US20090083036A1
    • 2009-03-26
    • US11903020
    • 2007-09-20
    • Yong Zhao; Frank Kao-ping Soong; Min Chu; Lijuan Wang
    • G10L13/08; G06F17/30
    • G10L13/10
    • Described is a technology by which synthesized speech generated from text is evaluated against a prosody model (trained offline) to determine whether the speech will sound unnatural. If so, the speech is regenerated with modified data. The evaluation and regeneration may be iterative until deemed natural sounding. For example, text is built into a lattice that is then (e.g., Viterbi) searched to find a best path. The sections (e.g., units) of data on the path are evaluated via a prosody model. If the evaluation deems a section to correspond to unnatural prosody, that section is replaced, e.g., by modifying/pruning the lattice and re-performing the search. Replacement may be iterative until all sections pass the evaluation. Unnatural prosody detection may be biased such that during evaluation, unnatural prosody is falsely detected at a higher rate relative to a rate at which unnatural prosody is missed.
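The search, evaluate, prune, and re-search loop in the abstract above can be sketched in Python as follows. This is an illustration under stated assumptions, not the application's implementation: the candidate fields (`id`, `cost`, `f0`), the functions `viterbi` and `synthesize`, the zero join cost, and the pitch-range check standing in for the offline-trained prosody model are all hypothetical.

```python
def viterbi(lattice, join_cost):
    """Return the minimum-cost path through a lattice given as a list of
    columns, where each column is a list of candidate-unit dicts."""
    # best[i][j] = (cheapest total cost ending in candidate j, backpointer)
    best = [[(cand["cost"], None) for cand in lattice[0]]]
    for i in range(1, len(lattice)):
        row = []
        for cand in lattice[i]:
            row.append(min(
                (best[i - 1][k][0] + join_cost(prev, cand) + cand["cost"], k)
                for k, prev in enumerate(lattice[i - 1])
            ))
        best.append(row)
    # Trace back from the cheapest candidate in the final column.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(lattice) - 1, -1, -1):
        path.append(lattice[i][j])
        j = best[i][j][1]
    return path[::-1]

def synthesize(lattice, join_cost, is_natural, max_rounds=10):
    """Search, evaluate each unit on the path, prune failures, repeat."""
    lattice = [list(column) for column in lattice]  # leave the input intact
    path = []
    for _ in range(max_rounds):
        path = viterbi(lattice, join_cost)
        bad = [(i, unit) for i, unit in enumerate(path) if not is_natural(unit)]
        if not bad:
            return path
        pruned = False
        for i, unit in bad:
            if len(lattice[i]) > 1:  # keep at least one candidate per column
                lattice[i].remove(unit)
                pruned = True
        if not pruned:
            break  # nothing left to replace
    return path

# Made-up data: the cheapest unit in the second column has implausible pitch.
lattice = [
    [{"id": "a1", "cost": 1.0, "f0": 120}, {"id": "a2", "cost": 2.0, "f0": 125}],
    [{"id": "b1", "cost": 1.0, "f0": 300}, {"id": "b2", "cost": 1.5, "f0": 130}],
]
no_join_cost = lambda prev, cand: 0.0
pitch_ok = lambda unit: 80 <= unit["f0"] <= 200  # stand-in prosody model
first_pass = [u["id"] for u in viterbi(lattice, no_join_cost)]
final = [u["id"] for u in synthesize(lattice, no_join_cost, pitch_ok)]
```

Tightening the range accepted by `pitch_ok` would flag more units than necessary, which is one simple way to obtain the bias toward false detections over misses that the abstract mentions.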
    • 7. Granted invention patent
    • Providing personalized voice font for text-to-speech applications
    • US07693719B2
    • 2010-04-06
    • US10977178
    • 2004-10-29
    • Min Chu; Yong Zhao; Sheng Zhao
    • G10L21/00; G10L13/00; G06F3/16
    • G10L13/033; G10L2021/0135
    • A method for synthesizing speech from text includes receiving one or more waveforms characteristic of a voice of a person selected by a user, generating a personalized voice font based on the one or more waveforms, and delivering the personalized voice font to the user's computer, whereby speech can be synthesized from text, the speech being in the voice of the selected person, the speech being synthesized using the personalized voice font. A system includes a text-to-speech (TTS) application operable to generate a voice font based on speech waveforms transmitted from a client computer remotely accessing the TTS application.
    • 9. Granted invention patent
    • Optimization of an objective measure for estimating mean opinion score of synthesized speech
    • US07386451B2
    • 2008-06-10
    • US10660388
    • 2003-09-11
    • Min Chu; Hu Peng; Yong Zhao
    • G10L13/08; G10L13/00
    • G10L25/69; G10L13/00
    • A method is provided for optimizing an objective measure used to estimate mean opinion score or naturalness of synthesized speech from a speech synthesizer. The method includes using an objective measure that has components derived directly from textual information used to form synthesized utterances. The objective measure has a high correlation with mean opinion score such that a relationship can be formed between the objective measure and corresponding mean opinion score. The objective measure is altered to provide a different function of textual information derived from the utterances so as to improve the relationship between the scores of the objective measure and subjective ratings of the synthesized utterances.
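One way to picture the optimization in the abstract above is as a search over alternative forms of the objective measure, keeping the form whose scores track mean opinion score (MOS) most closely. The Python sketch below assumes the measure is a weighted sum of per-utterance text-derived features and that candidate weightings are compared by Pearson correlation. The feature values, MOS ratings, weight candidates, and the functions `measure`, `pearson`, `best_weights`, and `fit_line` are all hypothetical and are not taken from the patent.

```python
from math import sqrt

def measure(features, weights):
    """Assumed objective measure: weighted sum of text-derived features."""
    return sum(w * f for w, f in zip(weights, features))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def best_weights(feature_rows, mos, candidates):
    """Pick the weighting whose scores correlate most strongly with MOS."""
    def strength(weights):
        scores = [measure(row, weights) for row in feature_rows]
        return abs(pearson(scores, mos))
    return max(candidates, key=strength)

def fit_line(xs, ys):
    """Least-squares line ys ~ slope * xs + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Made-up data: two features per utterance and one listener MOS per utterance.
feature_rows = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)]
mos = [4.0, 3.0, 2.0, 1.0]
candidates = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]

weights = best_weights(feature_rows, mos, candidates)
scores = [measure(row, weights) for row in feature_rows]
slope, intercept = fit_line(scores, mos)  # maps a new score to estimated MOS
```

Once a weighting is chosen, the fitted line plays the role of the relationship between objective score and MOS, so a new utterance could be rated from its text-derived features alone.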