    • 1. Granted patent
    • Title: Trainable videorealistic speech animation
    • Publication number: US07168953B1
    • Publication date: 2007-01-30
    • Application number: US10352319
    • Filing date: 2003-01-27
    • Inventors: Tomaso A. Poggio; Antoine F. Ezzat
    • Applicants: Tomaso A. Poggio; Antoine F. Ezzat
    • IPC: G09B19/04
    • CPC: G06T13/205; G06T13/40; G10L2021/105
    • Abstract: A method and apparatus for videorealistic speech animation is disclosed. A human subject is recorded using a video camera as he/she utters a predetermined speech corpus. After processing the corpus automatically, a visual speech module is learned from the data that is capable of synthesizing the human subject's mouth uttering entirely novel utterances that were not recorded in the original video. The synthesized utterance is re-composited onto a background sequence which contains natural head and eye movement. The final output is videorealistic in the sense that it looks like a video camera recording of the subject. The two key components of this invention are 1) a multidimensional morphable model (MMM) to synthesize new, previously unseen mouth configurations from a small set of mouth image prototypes; and 2) a trajectory synthesis technique based on regularization, which is automatically trained from the recorded video corpus, and which is capable of synthesizing trajectories in MMM space corresponding to any desired utterance.
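The trajectory-synthesis component described in the abstract above fits a smooth path through MMM parameter space using regularization. A minimal sketch of that idea, assuming a plain Tikhonov formulation with a second-difference smoothness penalty (the function name, the specific penalty, and the closed-form solve are illustrative, not the patent's actual formulation):

```python
import numpy as np

def synthesize_trajectory(targets, lam=1.0):
    """Regularized trajectory through per-frame parameter targets.

    targets: (T, d) array of desired MMM-parameter keypoints, one per frame.
    Minimizes ||y - targets||^2 + lam * ||D2 y||^2, where D2 is the
    second-difference operator, trading fidelity to the targets for
    smoothness of the resulting trajectory.
    """
    T = targets.shape[0]
    # Second-difference matrix D2 of shape (T-2, T); D2 @ y approximates
    # the acceleration of the trajectory at each interior frame.
    D2 = np.zeros((T - 2, T))
    for i in range(T - 2):
        D2[i, i:i + 3] = [1.0, -2.0, 1.0]
    # Normal equations of the quadratic objective: (I + lam * D2^T D2) y = targets.
    A = np.eye(T) + lam * D2.T @ D2
    return np.linalg.solve(A, targets)
```

With `lam=0` the solver returns the targets unchanged; increasing `lam` flattens rapid zigzags in the parameter track, which is the qualitative behavior the abstract attributes to regularization-based trajectory synthesis.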
    • 2. Granted patent
    • Title: Talking facial display method and apparatus
    • Publication number: US06250928B1
    • Publication date: 2001-06-26
    • Application number: US09223858
    • Filing date: 1998-12-31
    • Inventors: Tomaso A. Poggio; Antoine F. Ezzat
    • Applicants: Tomaso A. Poggio; Antoine F. Ezzat
    • IPC: G09B19/04
    • CPC: G09B19/04
    • Abstract: A method and apparatus for converting input text into an audio-visual speech stream, resulting in a talking face image enunciating the text. This method of converting input text into an audio-visual speech stream comprises the steps of: recording a visual corpus of a human subject, building a viseme interpolation database, and synchronizing the talking face image with the text stream. In a preferred embodiment, viseme transitions are automatically calculated using optical flow methods, and morphing techniques are employed to produce smooth viseme transitions. The viseme transitions are concatenated together and synchronized with the phonemes according to the timing information. The audio-visual speech stream is then displayed in real time, thereby displaying a photo-realistic talking face.
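The viseme interpolation and phoneme-timing synchronization described in the abstract above can be illustrated with a toy sketch. A plain linear cross-dissolve stands in for the patent's optical-flow-based morphing, and all names and data shapes here are illustrative assumptions:

```python
import numpy as np

def morph_sequence(visemes, timings, fps=30):
    """Generate frames by cross-dissolving between viseme images.

    visemes: dict mapping phoneme label -> grayscale image array (H, W).
    timings: list of (phoneme, onset_seconds) pairs, sorted by onset.
    Returns a list of float frames spanning the first to last phoneme
    onset, interpolating each consecutive viseme pair according to the
    timing information (a stand-in for optical-flow morphing).
    """
    frames = []
    for (p0, t0), (p1, t1) in zip(timings, timings[1:]):
        n = max(1, int(round((t1 - t0) * fps)))
        a, b = visemes[p0].astype(float), visemes[p1].astype(float)
        for k in range(n):
            alpha = k / n  # blend weight advances with elapsed time
            frames.append((1 - alpha) * a + alpha * b)
    # Hold the final viseme at the last phoneme onset.
    frames.append(visemes[timings[-1][0]].astype(float))
    return frames
```

Concatenating these per-pair transitions and holding the final viseme mirrors the abstract's step of joining viseme transitions and synchronizing them with the phoneme timings before real-time display.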