    • 1. Invention patent application
      Title: METHOD AND SYSTEM FOR ALIGNING NATURAL AND SYNTHETIC VIDEO TO SPEECH SYNTHESIS
      Publication number: US20080312930A1
      Publication date: 2008-12-18
      Application number: US12193397
      Filing date: 2008-08-18
      Inventors/Applicants: Andrea Basso; Mark Charles Beutnagel; Joern Ostermann
      IPC: G10L13/00; G10L13/08; G06T13/00
      CPC: G06T9/001; G10L13/00; G10L21/06; H04N21/2368; H04N21/4341
      Abstract: According to MPEG-4's TTS architecture, facial animation can be driven by two streams simultaneously—text, and Facial Animation Parameters. In this architecture, text input is sent to a Text-To-Speech converter at a decoder that drives the mouth shapes of the face. Facial Animation Parameters are sent from an encoder to the face over the communication channel. The present invention includes codes (known as bookmarks) in the text string transmitted to the Text-to-Speech converter, which bookmarks are placed between words as well as inside them. According to the present invention, the bookmarks carry an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time, and should be interpreted as a counter. In addition, the Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text. The system of the present invention reads the bookmark and provides the encoder time stamp as well as a real-time time stamp to the facial animation system. Finally, the facial animation system associates the correct facial animation parameter with the real-time time stamp using the encoder time stamp of the bookmark as a reference.
      (A short code sketch of the bookmark mechanism described in this abstract appears after this record.)
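The abstract above describes bookmark codes that carry an encoder time stamp (ETS) and are placed between words, or even inside them, in the text string sent to the Text-To-Speech converter. The Python sketch below only illustrates that idea: the bookmark escape syntax (\bm{<ETS>}), the function name, and the returned data layout are assumptions for illustration, not notation defined in the patents or in MPEG-4.

```python
import re

# Hypothetical bookmark escape: \bm{<encoder time stamp>}. The ETS is a
# counter, not wall-clock time, so it is kept as a plain integer.
BOOKMARK = re.compile(r"\\bm\{(\d+)\}")

def split_text_and_bookmarks(annotated_text: str):
    """Return (plain_text, [(char_offset, ets), ...]).

    Bookmarks may sit between words or inside a word, so each one is
    recorded as the character offset in the stripped text where it occurred.
    """
    plain_parts = []
    bookmarks = []
    pos = 0       # scan position in the annotated string
    out_len = 0   # length of plain text emitted so far
    for m in BOOKMARK.finditer(annotated_text):
        chunk = annotated_text[pos:m.start()]
        plain_parts.append(chunk)
        out_len += len(chunk)
        bookmarks.append((out_len, int(m.group(1))))  # ETS anchored at this offset
        pos = m.end()
    plain_parts.append(annotated_text[pos:])
    return "".join(plain_parts), bookmarks

text, marks = split_text_and_bookmarks(r"Hel\bm{3}lo \bm{7}world")
# text == "Hello world"; marks == [(3, 3), (6, 7)]
```

The plain text would go to the TTS converter, while the (offset, ETS) pairs mark the points at which each encoder time stamp becomes known as speech is synthesized.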
    • 3. Granted invention patent
      Title: Method and system for aligning natural and synthetic video to speech synthesis
      Publication number: US07584105B2
      Publication date: 2009-09-01
      Application number: US11931093
      Filing date: 2007-10-31
      Inventors/Applicants: Andrea Basso; Mark Charles Beutnagel; Joern Ostermann
      IPC: G10L13/00; G10L21/06; G06T13/00
      CPC: G06T9/001; G10L13/00; G10L21/06; H04N21/2368; H04N21/4341
      Abstract: According to MPEG-4's TTS architecture, facial animation can be driven by two streams simultaneously—text, and Facial Animation Parameters. In this architecture, text input is sent to a Text-To-Speech converter at a decoder that drives the mouth shapes of the face. Facial Animation Parameters are sent from an encoder to the face over the communication channel. The present invention includes codes (known as bookmarks) in the text string transmitted to the Text-to-Speech converter, which bookmarks are placed between words as well as inside them. According to the present invention, the bookmarks carry an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time, and should be interpreted as a counter. In addition, the Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text. The system of the present invention reads the bookmark and provides the encoder time stamp as well as a real-time time stamp to the facial animation system. Finally, the facial animation system associates the correct facial animation parameter with the real-time time stamp using the encoder time stamp of the bookmark as a reference.
    • 4. Granted invention patent
      Title: Method and system for aligning natural and synthetic video to speech synthesis
      Publication number: US07844463B2
      Publication date: 2010-11-30
      Application number: US12193397
      Filing date: 2008-08-18
      Inventors/Applicants: Andrea Basso; Mark Charles Beutnagel; Joern Ostermann
      IPC: G10L13/00; G06T13/00
      CPC: G06T9/001; G10L13/00; G10L21/06; H04N21/2368; H04N21/4341
      Abstract: According to MPEG-4's TTS architecture, facial animation can be driven by two streams simultaneously—text and Facial Animation Parameters. A Text-To-Speech converter drives the mouth shapes of the face. An encoder sends Facial Animation Parameters to the face. The text input can include codes, or bookmarks, transmitted to the Text-to-Speech converter, which are placed between and inside words. The bookmarks carry an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time, and should be interpreted as a counter. The Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text. The system reads the bookmark and provides the encoder time stamp and a real-time time stamp. The facial animation system associates the correct facial animation parameter with the real-time time stamp using the encoder time stamp of the bookmark as a reference.
    • 5. Granted invention patent
      Title: Method and system for aligning natural and synthetic video to speech synthesis
      Publication number: US07366670B1
      Publication date: 2008-04-29
      Application number: US11464018
      Filing date: 2006-08-11
      Inventors/Applicants: Andrea Basso; Mark Charles Beutnagel; Joern Ostermann
      IPC: G10L13/00; G06T13/00
      CPC: G06T9/001; G10L13/00; G10L21/06; H04N21/2368; H04N21/4341
      Abstract: Facial animation in MPEG-4 can be driven by a text stream and a Facial Animation Parameters (FAP) stream. Text input is sent to a TTS converter that drives the mouth shapes of the face. FAPs are sent from an encoder to the face over the communication channel. Disclosed are codes, or bookmarks, in the text string transmitted to the TTS converter. Bookmarks are placed between and inside words and carry an encoder time stamp. The encoder time stamp does not relate to real-world time. The FAP stream carries the same encoder time stamp found in the bookmark of the text. The system reads the bookmark and provides the encoder time stamp as well as a real-time time stamp to the facial animation system. The facial animation system associates the correct facial animation parameter with the real-time time stamp using the encoder time stamp of the bookmark as a reference.
      (An illustrative sketch of this real-time association step appears after this record.)
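The abstract above emphasizes the decoder side: the TTS engine reports when each bookmark was actually reached during synthesis (a real-time time stamp), and the FAP stream carries the same encoder time stamps, so matching ETS values lets the renderer schedule each facial animation parameter in real time. The sketch below shows that matching step under assumed data structures; FacialAnimationParams, its fields, and align_faps_to_real_time are illustrative names, not part of the MPEG-4 specification or of these patents.

```python
from dataclasses import dataclass

@dataclass
class FacialAnimationParams:
    ets: int       # encoder time stamp copied from the text bookmark
    values: dict   # e.g. {"jaw_open": 0.4} -- purely illustrative payload

def align_faps_to_real_time(bookmark_events, fap_stream):
    """bookmark_events: [(ets, rts_seconds), ...] reported by the TTS engine.
    fap_stream: iterable of FacialAnimationParams carrying the same ETS values.
    Returns a playback schedule [(rts_seconds, FacialAnimationParams), ...]."""
    rts_by_ets = dict(bookmark_events)
    schedule = []
    for fap in fap_stream:
        # The ETS acts only as a shared counter; the real-time stamp comes
        # from the moment the TTS engine reached the matching bookmark.
        if fap.ets in rts_by_ets:
            schedule.append((rts_by_ets[fap.ets], fap))
    return sorted(schedule, key=lambda item: item[0])

faps = [FacialAnimationParams(7, {"jaw_open": 0.6}),
        FacialAnimationParams(3, {"jaw_open": 0.2})]
print(align_faps_to_real_time([(3, 0.45), (7, 1.10)], faps))
# the FAP with ETS 3 is scheduled at 0.45 s, the FAP with ETS 7 at 1.10 s
```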
    • 7. Granted invention patent
      Title: Method and system for aligning natural and synthetic video to speech synthesis
      Publication number: US06862569B1
      Publication date: 2005-03-01
      Application number: US10350225
      Filing date: 2003-01-23
      Inventors/Applicants: Andrea Basso; Mark Charles Beutnagel; Joern Ostermann
      IPC: G10L13/04; G10L15/24; G10L21/06; H04N7/26; H04N7/50; G10L13/00; G06T13/00
      CPC: G10L15/24; G10L13/00; G10L2021/105; H04N19/20; H04N19/46; H04N19/61
      Abstract: According to MPEG-4's TTS architecture, facial animation can be driven by two streams simultaneously—text, and Facial Animation Parameters. In this architecture, text input is sent to a Text-To-Speech converter at a decoder that drives the mouth shapes of the face. Facial Animation Parameters are sent from an encoder to the face over the communication channel. The present invention includes codes (known as bookmarks) in the text string transmitted to the Text-to-Speech converter, which bookmarks are placed between words as well as inside them. According to the present invention, the bookmarks carry an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time, and should be interpreted as a counter. In addition, the Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text. The system of the present invention reads the bookmark and provides the encoder time stamp as well as a real-time time stamp to the facial animation system. Finally, the facial animation system associates the correct facial animation parameter with the real-time time stamp using the encoder time stamp of the bookmark as a reference.