Quick Patent Search - Quickly search global patents, free commercial-use patent database - IPRDB

2. Invention application

US20150042662A1 SYNTHETIC AUDIOVISUAL STORYTELLER (in force)
Publication (announcement) number: US20150042662A1
Publication (announcement) date: 2015-02-12
Application number: US14455573
Filing date: 2014-08-08
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Balakrishna Venkata Jagannadha Kolluru, Ioannis Stylianou, Robert Arthur Blokland, Norbert Braunschweiler, Kayoko Yanagisawa, Langzhou Chen, Ranniery MAIA, Robert Anderson, Bjorn Stenger, Roberto Cipolla, Neil Baker
IPC classification: G06T13/20, G06F3/16
CPC classification: G06T13/205, G06F3/16, G06T13/40, G06T13/80, G10L21/10, G10L2021/105
Abstract: A method of animating a computer generation of a head and displaying the text of an electronic book, such that the head has a mouth which moves in accordance with the speech of the text of the electronic book to be output by the head and a word or group of words from the text is displayed while simultaneously being mimed by the mouth, said method comprising:
  inputting the text of said book;
  dividing said input text into a sequence of acoustic units;
  determining expression characteristics for the inputted text;
  calculating a duration for each acoustic unit using a duration model;
  converting said sequence of acoustic units to a sequence of image vectors using a statistical model, wherein said model has a plurality of model parameters describing probability distributions which relate an acoustic unit to an image vector, said image vector comprising a plurality of parameters which define a face of said head;
  converting said sequence of acoustic units into a sequence of text display indicators using a text display model, wherein converting said sequence of acoustic units to said sequence of text display indicators comprises using the calculated duration of each acoustic unit to determine the timing and duration of the display of each section of text;
  outputting said sequence of image vectors as video such that the mouth of said head moves to mime the speech associated with the input text with the selected expression, wherein a parameter of a predetermined type of each probability distribution in said selected expression is expressed as a weighted sum of parameters of the same type, and wherein the weighting used is expression dependent, such that converting said sequence of acoustic units to a sequence of image vectors comprises retrieving the expression dependent weights for said selected expression, wherein the parameters are provided in clusters, and each cluster comprises at least one sub-cluster, wherein said expression dependent weights are retrieved for each cluster such that there is one weight per sub-cluster; and
  outputting said sequence of text display indicators as video which is synchronised with the lip movement of the head.
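
Illustrative sketch (not part of the patent record): two steps in the abstract lend themselves to a small worked example. One is expressing a distribution parameter as an expression-dependent weighted sum with one weight per sub-cluster. The other is using per-unit durations to decide when, and for how long, each section of text is displayed. The Python below is a minimal sketch under stated assumptions. The names AcousticUnit, expression_weighted_mean and text_display_schedule are hypothetical, only a single cluster is shown, a "section of text" is taken to be one word, and the durations are supplied by hand rather than by a trained duration model.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class AcousticUnit:
    """Hypothetical container for one acoustic unit (e.g. a phoneme)."""
    label: str        # unit label, illustrative only
    word_index: int   # index of the word this unit belongs to
    duration: float   # seconds; assumed to come from a duration model


def expression_weighted_mean(
    sub_cluster_means: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Weighted sum of same-type parameters, one weight per sub-cluster.

    Assumption: each sub-cluster contributes a mean vector of equal length,
    and `weights` holds the weights retrieved for the selected expression.
    """
    if len(sub_cluster_means) != len(weights):
        raise ValueError("need exactly one weight per sub-cluster")
    dim = len(sub_cluster_means[0])
    return [
        sum(w * mean[d] for w, mean in zip(weights, sub_cluster_means))
        for d in range(dim)
    ]


def text_display_schedule(
    units: Sequence[AcousticUnit],
    words: Sequence[str],
) -> List[Tuple[str, float, float]]:
    """Return (word, start_time, display_duration) for each word.

    Start times are accumulated unit durations, so the displayed text
    stays aligned with the speech that the mouth is miming.
    """
    spans: List[Tuple[int, float, float]] = []
    clock = 0.0
    for unit in units:
        if spans and spans[-1][0] == unit.word_index:
            # Same word as the previous unit: extend its display time.
            index, start, length = spans[-1]
            spans[-1] = (index, start, length + unit.duration)
        else:
            # First unit of a new word: it starts at the current clock.
            spans.append((unit.word_index, clock, unit.duration))
        clock += unit.duration
    return [(words[index], start, length) for index, start, length in spans]


if __name__ == "__main__":
    units = [
        AcousticUnit("h", 0, 0.25),
        AcousticUnit("ai", 0, 0.25),
        AcousticUnit("dh", 1, 0.5),
        AcousticUnit("eh", 1, 0.25),
    ]
    print(text_display_schedule(units, ["Hi", "there"]))
    print(expression_weighted_mean([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5]))
```

In this sketch, changing the selected expression would only change the weights passed to expression_weighted_mean; the sub-cluster parameters themselves stay fixed, which is how the abstract's wording reads.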
