Quick Patent Search - Fast search of global patents, free patent database for commercial use - IPRDB

1. Invention application

WO2021257177A1 SPONTANEOUS TEXT TO SPEECH (TTS) SYNTHESIS Under examination - Published
Publication (announcement) No.: WO2021257177A1
Publication (announcement) date: 2021-12-23
Application No.: PCT/US2021/028516
Filing date: 2021-04-22
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: ZHANG, Ran; LUAN, Jian; CONG, Yahuan
IPC classification: G10L13/10
Abstract: The present disclosure provides methods and apparatuses for spontaneous text-to-speech (TTS) synthesis. A target text may be obtained. A fluency reference factor may be determined based at least on the target text. An acoustic feature corresponding to the target text may be generated with the fluency reference factor. A speech waveform corresponding to the target text may be generated based on the acoustic feature.

2. Invention application

WO2020190395A1 PROVIDING EMOTION MANAGEMENT ASSISTANCE Under examination - Published
Publication (announcement) No.: WO2020190395A1
Publication (announcement) date: 2020-09-24
Application No.: PCT/US2020/016303
Filing date: 2020-02-03
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: XIU, Chi; LUAN, Jian
IPC classification: G11B27/031, G11B27/28, G10L25/63
Abstract: A method for providing emotion management assistance is provided. Sound streams may be received. A speech conversation between a user and at least one conversation object may be detected from the sound streams. The identity of the conversation object may be identified at least according to speech of the conversation object in the speech conversation. An emotion state of at least one speech segment of the user in the speech conversation may be determined. An emotion record corresponding to the speech conversation may be generated, wherein the emotion record includes at least the identity of the conversation object, at least a portion of the content of the speech conversation, and the emotion state of the at least one speech segment of the user.

3. Invention application

WO2019200584A1 GENERATING RESPONSE IN CONVERSATION Under examination - Published
Publication (announcement) No.: WO2019200584A1
Publication (announcement) date: 2019-10-24
Application No.: PCT/CN2018/083735
Filing date: 2018-04-19
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC; LUAN, Jian; XIAO, Zhe; NA, Xingyu; XIU, Chi; JU, Jianzhong; XU, Xiang
Inventors: LUAN, Jian; XIAO, Zhe; NA, Xingyu; XIU, Chi; JU, Jianzhong; XU, Xiang
IPC classification: G10L25/63, G10L15/22
Abstract: The present disclosure provides a method and apparatus for generating a response in a human-machine conversation. A first sound input may be received in the conversation. A first audio attribute may be extracted from the first sound input, wherein the first audio attribute indicates a first condition of a user. A second sound input may be received in the conversation. A second audio attribute may be extracted from the second sound input, wherein the second audio attribute indicates a second condition of the user. A difference between the second audio attribute and the first audio attribute is determined, wherein the difference indicates a condition change of the user from the first condition to the second condition. A response to the second sound input is generated based at least on the condition change.
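The comparison step described in the abstract above can be illustrated with a minimal sketch. It assumes each audio attribute is a short numeric vector whose first entry is an arousal-like score; the function names, the threshold, and the canned responses are hypothetical and are not taken from the application.

```python
from typing import List, Sequence


def condition_change(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Element-wise difference (second - first) between two audio attribute vectors."""
    if len(first) != len(second):
        raise ValueError("audio attribute vectors must have the same length")
    return [b - a for a, b in zip(first, second)]


def choose_response(change: Sequence[float], threshold: float = 0.2) -> str:
    """Pick a response from the change in the first attribute (assumed: arousal)."""
    arousal_delta = change[0]
    if arousal_delta > threshold:
        return "You sound more agitated than before. Is everything okay?"
    if arousal_delta < -threshold:
        return "You sound calmer now."
    return "Got it."


# Hypothetical attribute vectors: [arousal, valence]
first_attribute = [0.25, 0.5]
second_attribute = [0.75, 0.5]
delta = condition_change(first_attribute, second_attribute)
print(choose_response(delta))
```

The point of the sketch is only that the response depends on the change between the two inputs rather than on the second input alone.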

4. Invention application

WO2021021305A1 OBTAINING A SINGING VOICE DETECTION MODEL Under examination - Published
Publication (announcement) No.: WO2021021305A1
Publication (announcement) date: 2021-02-04
Application No.: PCT/US2020/036869
Filing date: 2020-06-10
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: HOU, Yuanbo; LUAN, Jian; SOONG, Kao-Ping
IPC classification: G10H1/00, G06N3/02, G10L25/30, G10L25/51
Abstract: The present disclosure provides methods and apparatuses for obtaining a singing voice detection model. A plurality of speech clips and a plurality of instrumental music clips may be synthesized into a plurality of audio clips. A speech detection model may be trained with the plurality of audio clips. At least a part of the speech detection model may be transferred to a singing voice detection model. The singing voice detection model may be trained with a set of polyphonic music clips.

5. Invention application

WO2018090356A1 AUTOMATIC DUBBING METHOD AND APPARATUS Under examination - Published
Publication (announcement) No.: WO2018090356A1
Publication (announcement) date: 2018-05-24
Application No.: PCT/CN2016/106554
Filing date: 2016-11-21
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC; GABRYJELSKI, Henry; LUAN, Jian; LI, Dapeng
Inventors: GABRYJELSKI, Henry; LUAN, Jian; LI, Dapeng
IPC classification: G10L13/04
Abstract: An automatic dubbing method is disclosed. The method comprises: extracting speeches of a voice from an audio portion of a media content (504); obtaining a voice print model for the extracted speeches of the voice (506); processing the extracted speeches by utilizing the voice print model to generate replacement speeches (508); and replacing the extracted speeches of the voice with the generated replacement speeches in the audio portion of the media content (510).

6. Invention application

WO2017218243A3 INTENT RECOGNITION AND EMOTIONAL TEXT-TO-SPEECH LEARNING SYSTEM Under examination - Published
Publication (announcement) No.: WO2017218243A3
Publication (announcement) date: 2017-12-21
Application No.: PCT/US2017/036241
Filing date: 2017-06-07
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: ZHAO, Pei; YAO, Kaisheng; LEUNG, Max; YAN, Bo; LUAN, Jian; SHI, Yu; MA, Malone; HWANG, Mei-Yuh
IPC classification: G10L25/63, G06F17/27, G10L15/26
Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text results and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.

7. Invention application

WO2015130581A1 VOICE FONT SPEAKER AND PROSODY INTERPOLATION Under examination - Published
Publication (announcement) No.: WO2015130581A1
Publication (announcement) date: 2015-09-03
Application No.: PCT/US2015/017002
Filing date: 2015-02-23
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: LUAN, Jian; HE, Lei; LEUNG, Max
IPC classification: G10L13/08, G10L13/033
CPC classification: G10L13/0335, G06F3/0482, G06F3/04847, G10L13/02, G10L13/033, G10L13/08
Abstract: Multi-voice font interpolation is provided. A multi-voice font interpolation engine allows the production of computer-generated speech with a wide variety of speaker characteristics and/or prosody by interpolating speaker characteristics and prosody from existing fonts. Using prediction models from multiple voice fonts, the multi-voice font interpolation engine predicts values for the parameters that influence speaker characteristics and/or prosody for the phoneme sequence obtained from the text to be spoken. For each parameter, additional parameter values are generated by a weighted interpolation from the predicted values. Modifying an existing voice font with the interpolated parameters changes the style and/or emotion of the speech while retaining the base sound qualities of the original voice. The multi-voice font interpolation engine allows the speaker characteristics and/or prosody to be transplanted from one voice font to another or entirely new speaker characteristics and/or prosody to be generated for an existing voice font.
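The weighted interpolation mentioned in the abstract above can be sketched as follows. This is a minimal illustration only: it assumes each voice font predicts one numeric value per parameter (for example pitch and duration), and the function name, the weight normalisation, and the sample values are assumptions rather than details from the application.

```python
from typing import List, Sequence


def interpolate_parameters(
    predictions: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Weighted interpolation of per-font parameter predictions.

    predictions[k][i] is the value of parameter i predicted by voice font k.
    Weights are normalised to sum to 1 before interpolating.
    """
    if not predictions or len(predictions) != len(weights):
        raise ValueError("need one weight per voice font")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    norm = [w / total for w in weights]
    n_params = len(predictions[0])
    if any(len(p) != n_params for p in predictions):
        raise ValueError("all fonts must predict the same parameters")
    return [
        sum(w * p[i] for w, p in zip(norm, predictions))
        for i in range(n_params)
    ]


# Hypothetical predictions from two voice fonts: [pitch in Hz, relative duration]
font_a = [100.0, 0.0]
font_b = [200.0, 1.0]
print(interpolate_parameters([font_a, font_b], [1.0, 1.0]))
```

With equal weights the result sits halfway between the two fonts; shifting the weights moves the interpolated values toward one font or the other.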

8. Invention application

WO2021242366A1 PHRASE-BASED END-TO-END TEXT-TO-SPEECH (TTS) SYNTHESIS Under examination - Published
Publication (announcement) No.: WO2021242366A1
Publication (announcement) date: 2021-12-02
Application No.: PCT/US2021/023054
Filing date: 2021-03-19
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: ZHANG, Ran; LUAN, Jian; CONG, Yahuan
IPC classification: G10L13/047, G10L13/08, G10L13/10, G10L13/02, G06N3/04
Abstract: The present disclosure provides methods and apparatuses for phrase-based end-to-end text-to-speech (TTS) synthesis. A text may be obtained. A target phrase in the text may be identified. A phrase context of the target phrase may be determined. An acoustic feature corresponding to the target phrase may be generated based at least on the target phrase and the phrase context. A speech waveform corresponding to the target phrase may be generated based on the acoustic feature.

9. Invention application

WO2016040209A1 TEXT-TO-SPEECH WITH EMOTIONAL CONTENT Under examination - Published
Publication (announcement) No.: WO2016040209A1
Publication (announcement) date: 2016-03-17
Application No.: PCT/US2015/048755
Filing date: 2015-09-07
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: LUAN, Jian; HE, Lei; LEUNG, Max
IPC classification: G10L13/033
CPC classification: G10L13/027, G10L13/033
Abstract: Techniques are disclosed for converting text to speech having emotional content. In an aspect, an emotionally neutral acoustic trajectory is predicted for a script using a neutral model, and an emotion-specific acoustic trajectory adjustment is independently predicted using an emotion-specific model. The neutral trajectory and emotion-specific adjustments are combined to generate a transformed speech output having emotional content. In another aspect, state parameters of a statistical parametric model for neutral voice are transformed by emotion-specific factors that vary across contexts and states. The emotion-dependent adjustment factors may be clustered and stored using an emotion-specific decision tree or other clustering scheme distinct from a decision tree used for the neutral voice model.
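The combination of a neutral trajectory with an emotion-specific adjustment, as described in the abstract above, might look like the sketch below. The abstract does not say how the two are combined; frame-wise addition is an assumption made here for illustration, and the function name and sample values are hypothetical.

```python
from typing import List, Sequence


def apply_emotion_adjustment(
    neutral: Sequence[float],
    adjustment: Sequence[float],
) -> List[float]:
    """Combine a neutral acoustic trajectory with an emotion-specific adjustment.

    Assumption: the combination is a frame-wise sum. Both sequences hold one
    value per frame (for example a pitch contour) and must be equally long.
    """
    if len(neutral) != len(adjustment):
        raise ValueError("trajectory and adjustment must have the same length")
    return [n + a for n, a in zip(neutral, adjustment)]


# Hypothetical per-frame values predicted independently by two models
neutral_trajectory = [1.0, 2.0]
emotion_adjustment = [0.5, -0.5]
print(apply_emotion_adjustment(neutral_trajectory, emotion_adjustment))
```

Keeping the two predictions separate means the same neutral trajectory can be reused with a different adjustment for each emotion.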

10. Invention application

WO2021167732A1 IMPLEMENTING AUTOMATIC CHATTING DURING VIDEO DISPLAYING Under examination - Published
Publication (announcement) No.: WO2021167732A1
Publication (announcement) date: 2021-08-26
Application No.: PCT/US2021/014043
Filing date: 2021-01-20
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: XUE, Rui; XIU, Chi; LUAN, Jian
IPC classification: H04N21/43, H04N21/45, H04L12/58, G06F16/332, G06F16/242
Abstract: The present disclosure provides methods and apparatus for implementing automatic chatting during video displaying. User-side information may be obtained. Video information may be detected from the video. A response may be determined based at least on the user-side information and the video information. The response may be provided in a session.
