    • 1. Invention grant
    • Face feature analysis for automatic lipreading and character animation
    • Publication: US6028960A (2000-02-22)
    • Application: US716959 (filed 1996-09-20)
    • Inventors: Hans Peter Graf; Eric David Petajan
    • IPC: G06K9/00; G06K9/46
    • CPC: G06K9/00268
    • A face feature analysis which begins by generating multiple face feature candidates, e.g., eyes and nose positions, using an isolated frame face analysis. Then, a nostril tracking window is defined around a nose candidate and tests are applied to the pixels therein based on percentages of skin color area pixels and nostril area pixels to determine whether the nose candidate represents an actual nose. Once actual nostrils are identified, size, separation and contiguity of the actual nostrils is determined by projecting the nostril pixels within the nostril tracking window. A mouth window is defined around the mouth region and mouth detail analysis is then applied to the pixels within the mouth window to identify inner mouth and teeth pixels and therefrom generate an inner mouth contour. The nostril position and inner mouth contour are used to generate a synthetic model head. A direct comparison is made between the inner mouth contour generated and that of a synthetic model head and the synthetic model head is adjusted accordingly. Vector quantization algorithms may be used to develop a codebook of face model parameters to improve processing efficiency. The face feature analysis is suitable regardless of noise, illumination variations, head tilt, scale variations and nostril shape.
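The nostril-verification step in the abstract above — testing the percentages of skin-area and nostril-area pixels inside the nostril tracking window — can be illustrated with a minimal sketch. The thresholds and the simple brightness-based pixel classification are assumptions for illustration, not the patent's actual method:

```python
def is_actual_nose(window, skin_min=0.4, nostril_min=0.02, nostril_max=0.25):
    """Accept a nose candidate only if the nostril tracking window shows a
    plausible mix of skin-area and nostril-area pixels (percentage tests,
    as described in the abstract).  Thresholds and the brightness-based
    classification of pixels are illustrative assumptions."""
    pixels = [p for row in window for p in row]
    nostril_pct = sum(p < 0.2 for p in pixels) / len(pixels)  # dark pixels ~ nostrils
    skin_pct = sum(p > 0.5 for p in pixels) / len(pixels)     # bright pixels ~ skin
    return skin_pct >= skin_min and nostril_min <= nostril_pct <= nostril_max

# Toy window: bright "skin" with two small dark "nostril" blobs.
win = [[0.05 if 8 <= r < 11 and (5 <= c < 8 or 13 <= c < 16) else 0.8
        for c in range(20)] for r in range(20)]
print(is_actual_nose(win))                       # True
print(is_actual_nose([[0.0] * 20] * 20))         # False: all dark, no skin
```

A real implementation would classify pixels in a colour space rather than by brightness, then project the nostril pixels to measure size, separation, and contiguity as the abstract describes.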
    • 2. Invention grant
    • Systems and methods for encoding and decoding video streams
    • Publication: US08638846B1 (2014-01-28)
    • Application: US10601495 (filed 2003-06-23)
    • Inventors: Eric Cosatto; Hans Peter Graf; Joern Ostermann
    • IPC: H04N7/12
    • CPC: H04N19/573; G06T9/001; H04N19/21; H04N19/23; H04N19/55
    • Systems and methods for encoding/decoding a video stream. Animated talking heads are coded using partial offline encoding, multiple video streams, and multiple reference frames. The content of a face animation video that is known beforehand is encoded offline and the remaining content is encoded online and included in the video stream. To reduce bit rate, a server can stream multiple video sequences to the client and the video sequences are stored in the client's frame store. The server sends instructions to play a particular video sequence instead of streaming the particular video sequence. Multiple video streams can also be streamed to the client. Positional data and blending data are also sent to properly position one video stream relative to another video stream and to blend one video stream into another video stream.
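The bandwidth-saving rule in this abstract — send a short play instruction when the client's frame store already holds a sequence, stream it otherwise — can be sketched as follows; the function and message names are illustrative, not from the patent:

```python
def serve(sequence_id, client_cache, load_sequence):
    """Sketch of the rule from the abstract: if the client's frame store
    already holds the sequence, send only a short 'play' instruction;
    otherwise stream the sequence and note that the client now caches it.
    All names here are hypothetical."""
    if sequence_id in client_cache:
        return ("play", sequence_id)  # a few bytes instead of a full stream
    client_cache.add(sequence_id)
    return ("stream", sequence_id, load_sequence(sequence_id))

cache = set()
print(serve("greeting", cache, lambda s: b"<frames>")[0])  # stream
print(serve("greeting", cache, lambda s: b"<frames>")[0])  # play
```

The same idea extends to the multi-stream case in the abstract, where positional and blending data accompany the instructions so one cached stream can be composited onto another.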
    • 5. Invention grant
    • System and method for sending multi-media messages using emoticons
    • Publication: US07921013B1 (2011-04-05)
    • Application: US11214666 (filed 2005-08-30)
    • Inventors: Joern Ostermann; Mehmet Reha Civanlar; Eric Cosatto; Hans Peter Graf; Yann Andre LeCun
    • IPC: G10L13/00; G10L13/06; G10L21/00
    • CPC: G06Q10/107; G06F17/241; G10L13/00
    • A system and method of providing sender-customization of multi-media messages through the use of emoticons is disclosed. The sender inserts the emoticons into a text message. As an animated face audibly delivers the text, emoticons associated with the message are started a predetermined period of time or number of words prior to the position of the emoticon in the message text and completed a predetermined length of time or number of words following the location of the emoticon. The sender may insert emoticons through the use of emoticon buttons that are icons available for choosing. Upon sender selections of an emoticon, an icon representing the emoticon is inserted into the text at the position of the cursor. Once an emoticon is chosen, the sender may also choose the amplitude for the emoticon and increased or decreased amplitude will be displayed in the icon inserted into the message text.
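The timing rule in this abstract — start an emoticon's facial animation a set number of words before its position in the text and finish a set number of words after — reduces to simple index arithmetic. A minimal sketch, with an illustrative lead/lag of two words:

```python
def emoticon_schedule(words, emoticon_positions, lead=2, lag=2):
    """For each emoticon (given as a word index) return the word indices
    at which its animation should start and finish: a fixed number of
    words before and after the emoticon's place in the text, clamped to
    the message bounds.  The two-word lead/lag is an assumption."""
    schedule = []
    for pos in emoticon_positions:
        start = max(0, pos - lead)
        end = min(len(words) - 1, pos + lag)
        schedule.append((pos, start, end))
    return schedule

msg = "great to see you :-) talk again soon".split()
print(emoticon_schedule(msg, [4]))  # [(4, 2, 6)]
```

The patent's predetermined period could equally be expressed in time rather than words; the clamping matters for emoticons near the start or end of the message.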
    • 8. Invention grant
    • System and method of controlling sound in a multi-media communication application
    • Publication: US06963839B1 (2005-11-08)
    • Application: US09999526 (filed 2001-11-02)
    • Inventors: Joern Ostermann; Mehmet Reha Civanlar; Hans Peter Graf; Thomas M. Isaacson
    • IPC: G06F17/28; G10L13/00; G10L13/08; G10L21/00
    • CPC: G10L13/08
    • A method for customizing a voice in a multi-media message created by a sender for a recipient is disclosed. The multi-media message comprises a text message from the sender to be delivered by an animated entity. The method comprises presenting an option to the sender to insert voice emoticons into the text message associated with parameters of a voice used by the animated entity to deliver the text message. The message is then delivered wherein the voice of the animated entity is modified throughout the message according to the voice emoticons. The voice emoticons may relate to features such as voice stress, volume, pauses, emotion, yelling, or whispering. After the sender inserts various voice emoticons into the text of the message, the animated entity delivers the multi-media message giving effect to each voice emoticon in the text. A volume or intensity of the voice emoticons may be given effect by repeating the tags. In this case, delivering the multi-media message further comprises delivering the multi-media message at a variable level associated with the number of times a respective voice emoticon is repeated. In this manner, the sender may control the presentation of the message to increase the overall effectiveness of the multi-media message.
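The abstract's rule that repeating a voice-emoticon tag raises its intensity — delivery "at a variable level associated with the number of times a respective voice emoticon is repeated" — can be sketched as a count-and-scale step. The `:loud:` tag name and the scaling step are hypothetical:

```python
import re

def voice_volume(text, base=1.0, step=0.25):
    """Return a volume level that scales with how many times a voice
    emoticon tag is repeated in the message, as the abstract describes.
    The ':loud:' tag and the linear scaling are illustrative choices."""
    repeats = len(re.findall(r":loud:", text))
    return base + step * repeats

print(voice_volume("please :loud::loud: listen to this"))  # 1.5
```

A full implementation would map each kind of voice emoticon (stress, pauses, whispering, and so on) to its own text-to-speech parameter rather than a single volume scalar.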
    • 10. Invention grant
    • System and method for triphone-based unit selection for visual speech synthesis
    • Publication: US09583098B1 (2017-02-28)
    • Application: US11924025 (filed 2007-10-25)
    • Inventors: Eric Cosatto; Hans Peter Graf; Fu Jie Huang
    • IPC: G10L15/26; G06T15/70; G10L15/08; G10L13/07; H04N19/00
    • CPC: G10L15/08; G10L13/07; G10L15/26; G10L2021/105; H04N19/00
    • A system and method for generating a video sequence having mouth movements synchronized with speech sounds are disclosed. The system utilizes a database of n-phones as the smallest selectable unit, wherein n is larger than 1 and preferably 3. The system calculates a target cost for each candidate n-phone for a target frame using a phonetic distance, coarticulation parameter, and speech rate. For each n-phone in a target sequence, the system searches for candidate n-phones that are visually similar according to the target cost. The system samples each candidate n-phone to get a same number of frames as in the target sequence and builds a video frame lattice of candidate video frames. The system assigns a joint cost to each pair of adjacent frames and searches the video frame lattice to construct the video sequence by finding the optimal path through the lattice according to the minimum of the sum of the target cost and the joint cost over the sequence.
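The lattice search in this abstract — choose one candidate frame per step so that the sum of target costs plus joint costs between adjacent chosen frames is minimal — is a standard dynamic-programming (Viterbi-style) search. A minimal sketch with toy costs, not the patent's actual cost functions:

```python
def best_path(target_costs, joint_cost):
    """Dynamic-programming search through a video-frame lattice.
    target_costs[t][i] is the target cost of candidate i at step t;
    joint_cost(a, b) scores the transition between adjacent candidates,
    each identified by a (step, index) pair.  Returns the candidate
    indices of the minimum-cost path."""
    best = [list(target_costs[0])]           # cumulative cost per candidate
    back = [[None] * len(target_costs[0])]   # backpointers for traceback
    for t in range(1, len(target_costs)):
        row, ptr = [], []
        for i, tc in enumerate(target_costs[t]):
            costs = [best[t - 1][j] + joint_cost((t - 1, j), (t, i))
                     for j in range(len(best[t - 1]))]
            j = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[j] + tc)
            ptr.append(j)
        best.append(row)
        back.append(ptr)
    i = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = [i]
    for t in range(len(target_costs) - 1, 0, -1):
        i = back[t][i]
        path.append(i)
    return path[::-1]

# Toy lattice: 3 steps, 2 candidates each; the joint cost penalises
# switching between candidate identities.
tc = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
jc = lambda a, b: 0.0 if a[1] == b[1] else 0.4
print(best_path(tc, jc))  # [0, 1, 0]
```

In the patent's setting, the target cost would combine phonetic distance, coarticulation, and speech rate for each candidate triphone's frames, and the joint cost would measure visual smoothness between adjacent frames.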