    • 1. Invention patent application
      Title: METHOD AND SYSTEM FOR ALIGNING NATURAL AND SYNTHETIC VIDEO TO SPEECH SYNTHESIS
      Publication number: US20080312930A1
      Publication date: 2008-12-18
      Application number: US12193397
      Filing date: 2008-08-18
      Inventors/Applicants: Andrea Basso; Mark Charles Beutnagel; Joern Ostermann
      IPC: G10L13/00; G10L13/08; G06T13/00
      CPC: G06T9/001; G10L13/00; G10L21/06; H04N21/2368; H04N21/4341
      Abstract: According to MPEG-4's TTS architecture, facial animation can be driven by two streams simultaneously—text, and Facial Animation Parameters. In this architecture, text input is sent to a Text-To-Speech converter at a decoder that drives the mouth shapes of the face. Facial Animation Parameters are sent from an encoder to the face over the communication channel. The present invention includes codes (known as bookmarks) in the text string transmitted to the Text-to-Speech converter, which bookmarks are placed between words as well as inside them. According to the present invention, the bookmarks carry an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time, and should be interpreted as a counter. In addition, the Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text. The system of the present invention reads the bookmark and provides the encoder time stamp as well as a real-time time stamp to the facial animation system. Finally, the facial animation system associates the correct facial animation parameter with the real-time time stamp using the encoder time stamp of the bookmark as a reference.
      (A short code sketch of the bookmark mechanism described in this abstract appears after this record.)
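The abstract above describes bookmark codes that carry an encoder time stamp (ETS) and are placed between words, or even inside them, in the text string sent to the Text-To-Speech converter. The Python sketch below only illustrates that idea: the bookmark escape syntax (\bm{<ETS>}), the function name, and the returned data layout are assumptions for illustration, not notation defined in the patents or in MPEG-4.

```python
import re

# Hypothetical bookmark escape: \bm{<encoder time stamp>}. The ETS is a
# counter, not wall-clock time, so it is kept as a plain integer.
BOOKMARK = re.compile(r"\\bm\{(\d+)\}")

def split_text_and_bookmarks(annotated_text: str):
    """Return (plain_text, [(char_offset, ets), ...]).

    Bookmarks may sit between words or inside a word, so each one is
    recorded as the character offset in the stripped text where it occurred.
    """
    plain_parts = []
    bookmarks = []
    pos = 0       # scan position in the annotated string
    out_len = 0   # length of plain text emitted so far
    for m in BOOKMARK.finditer(annotated_text):
        chunk = annotated_text[pos:m.start()]
        plain_parts.append(chunk)
        out_len += len(chunk)
        bookmarks.append((out_len, int(m.group(1))))  # ETS anchored at this offset
        pos = m.end()
    plain_parts.append(annotated_text[pos:])
    return "".join(plain_parts), bookmarks

text, marks = split_text_and_bookmarks(r"Hel\bm{3}lo \bm{7}world")
# text == "Hello world"; marks == [(3, 3), (6, 7)]
```

The plain text would go to the TTS converter, while the (offset, ETS) pairs mark the points at which each encoder time stamp becomes known as speech is synthesized.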
    • 3. Granted invention patent
      Title: Method and system for aligning natural and synthetic video to speech synthesis
      Publication number: US07584105B2
      Publication date: 2009-09-01
      Application number: US11931093
      Filing date: 2007-10-31
      Inventors/Applicants: Andrea Basso; Mark Charles Beutnagel; Joern Ostermann
      IPC: G10L13/00; G10L21/06; G06T13/00
      CPC: G06T9/001; G10L13/00; G10L21/06; H04N21/2368; H04N21/4341
      Abstract: According to MPEG-4's TTS architecture, facial animation can be driven by two streams simultaneously—text, and Facial Animation Parameters. In this architecture, text input is sent to a Text-To-Speech converter at a decoder that drives the mouth shapes of the face. Facial Animation Parameters are sent from an encoder to the face over the communication channel. The present invention includes codes (known as bookmarks) in the text string transmitted to the Text-to-Speech converter, which bookmarks are placed between words as well as inside them. According to the present invention, the bookmarks carry an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time, and should be interpreted as a counter. In addition, the Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text. The system of the present invention reads the bookmark and provides the encoder time stamp as well as a real-time time stamp to the facial animation system. Finally, the facial animation system associates the correct facial animation parameter with the real-time time stamp using the encoder time stamp of the bookmark as a reference.
    • 4. Granted invention patent
      Title: Method and system for aligning natural and synthetic video to speech synthesis
      Publication number: US07844463B2
      Publication date: 2010-11-30
      Application number: US12193397
      Filing date: 2008-08-18
      Inventors/Applicants: Andrea Basso; Mark Charles Beutnagel; Joern Ostermann
      IPC: G10L13/00; G06T13/00
      CPC: G06T9/001; G10L13/00; G10L21/06; H04N21/2368; H04N21/4341
      Abstract: According to MPEG-4's TTS architecture, facial animation can be driven by two streams simultaneously—text and Facial Animation Parameters. A Text-To-Speech converter drives the mouth shapes of the face. An encoder sends Facial Animation Parameters to the face. The text input can include codes, or bookmarks, transmitted to the Text-to-Speech converter, which are placed between and inside words. The bookmarks carry an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time, and should be interpreted as a counter. The Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text. The system reads the bookmark and provides the encoder time stamp and a real-time time stamp. The facial animation system associates the correct facial animation parameter with the real-time time stamp using the encoder time stamp of the bookmark as a reference.
    • 5. Granted invention patent
      Title: Method and system for aligning natural and synthetic video to speech synthesis
      Publication number: US07366670B1
      Publication date: 2008-04-29
      Application number: US11464018
      Filing date: 2006-08-11
      Inventors/Applicants: Andrea Basso; Mark Charles Beutnagel; Joern Ostermann
      IPC: G10L13/00; G06T13/00
      CPC: G06T9/001; G10L13/00; G10L21/06; H04N21/2368; H04N21/4341
      Abstract: Facial animation in MPEG-4 can be driven by a text stream and a Facial Animation Parameters (FAP) stream. Text input is sent to a TTS converter that drives the mouth shapes of the face. FAPs are sent from an encoder to the face over the communication channel. Disclosed are codes, or bookmarks, in the text string transmitted to the TTS converter. Bookmarks are placed between and inside words and carry an encoder time stamp. The encoder time stamp does not relate to real-world time. The FAP stream carries the same encoder time stamp found in the bookmark of the text. The system reads the bookmark and provides the encoder time stamp as well as a real-time time stamp to the facial animation system. The facial animation system associates the correct facial animation parameter with the real-time time stamp using the encoder time stamp of the bookmark as a reference.
      (An illustrative sketch of this real-time association step appears after this record.)
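The abstract above emphasizes the decoder side: the TTS engine reports when each bookmark was actually reached during synthesis (a real-time time stamp), and the FAP stream carries the same encoder time stamps, so matching ETS values lets the renderer schedule each facial animation parameter in real time. The sketch below shows that matching step under assumed data structures; FacialAnimationParams, its fields, and align_faps_to_real_time are illustrative names, not part of the MPEG-4 specification or of these patents.

```python
from dataclasses import dataclass

@dataclass
class FacialAnimationParams:
    ets: int       # encoder time stamp copied from the text bookmark
    values: dict   # e.g. {"jaw_open": 0.4} -- purely illustrative payload

def align_faps_to_real_time(bookmark_events, fap_stream):
    """bookmark_events: [(ets, rts_seconds), ...] reported by the TTS engine.
    fap_stream: iterable of FacialAnimationParams carrying the same ETS values.
    Returns a playback schedule [(rts_seconds, FacialAnimationParams), ...]."""
    rts_by_ets = dict(bookmark_events)
    schedule = []
    for fap in fap_stream:
        # The ETS acts only as a shared counter; the real-time stamp comes
        # from the moment the TTS engine reached the matching bookmark.
        if fap.ets in rts_by_ets:
            schedule.append((rts_by_ets[fap.ets], fap))
    return sorted(schedule, key=lambda item: item[0])

faps = [FacialAnimationParams(7, {"jaw_open": 0.6}),
        FacialAnimationParams(3, {"jaw_open": 0.2})]
print(align_faps_to_real_time([(3, 0.45), (7, 1.10)], faps))
# the FAP with ETS 3 is scheduled at 0.45 s, the FAP with ETS 7 at 1.10 s
```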
    • 7. Granted invention patent
      Title: Method and system for aligning natural and synthetic video to speech synthesis
      Publication number: US06862569B1
      Publication date: 2005-03-01
      Application number: US10350225
      Filing date: 2003-01-23
      Inventors/Applicants: Andrea Basso; Mark Charles Beutnagel; Joern Ostermann
      IPC: G10L13/04; G10L15/24; G10L21/06; H04N7/26; H04N7/50; G10L13/00; G06T13/00
      CPC: G10L15/24; G10L13/00; G10L2021/105; H04N19/20; H04N19/46; H04N19/61
      Abstract: According to MPEG-4's TTS architecture, facial animation can be driven by two streams simultaneously—text, and Facial Animation Parameters. In this architecture, text input is sent to a Text-To-Speech converter at a decoder that drives the mouth shapes of the face. Facial Animation Parameters are sent from an encoder to the face over the communication channel. The present invention includes codes (known as bookmarks) in the text string transmitted to the Text-to-Speech converter, which bookmarks are placed between words as well as inside them. According to the present invention, the bookmarks carry an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time, and should be interpreted as a counter. In addition, the Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text. The system of the present invention reads the bookmark and provides the encoder time stamp as well as a real-time time stamp to the facial animation system. Finally, the facial animation system associates the correct facial animation parameter with the real-time time stamp using the encoder time stamp of the bookmark as a reference.