    • 2. Invention application
    • SYNTHETIC AUDIOVISUAL STORYTELLER
    • US20150042662A1
    • 2015-02-12
    • US14455573
    • 2014-08-08
    • KABUSHIKI KAISHA TOSHIBA
    • Javier Latorre-Martinez; Vincent Ping Leung Wan; Balakrishna Venkata Jagannadha Kolluru; Ioannis Stylianou; Robert Arthur Blokland; Norbert Braunschweiler; Kayoko Yanagisawa; Langzhou Chen; Ranniery MAIA; Robert Anderson; Bjorn Stenger; Roberto Cipolla; Neil Baker
    • G06T13/20; G06F3/16
    • G06T13/205; G06F3/16; G06T13/40; G06T13/80; G10L21/10; G10L2021/105
    • A method of animating a computer generation of a head and displaying the text of an electronic book, such that the head has a mouth which moves in accordance with the speech of the text of the electronic book to be output by the head and a word or group of words from the text is displayed while simultaneously being mimed by the mouth, said method comprising:
      inputting the text of said book;
      dividing said input text into a sequence of acoustic units;
      determining expression characteristics for the inputted text;
      calculating a duration for each acoustic unit using a duration model;
      converting said sequence of acoustic units to a sequence of image vectors using a statistical model, wherein said model has a plurality of model parameters describing probability distributions which relate an acoustic unit to an image vector, said image vector comprising a plurality of parameters which define a face of said head;
      converting said sequence of acoustic units into a sequence of text display indicators using a text display model, wherein converting said sequence of acoustic units to said sequence of text display indicators comprises using the calculated duration of each acoustic unit to determine the timing and duration of the display of each section of text;
      outputting said sequence of image vectors as video such that the mouth of said head moves to mime the speech associated with the input text with the selected expression, wherein a parameter of a predetermined type of each probability distribution in said selected expression is expressed as a weighted sum of parameters of the same type, and wherein the weighting used is expression dependent, such that converting said sequence of acoustic units to a sequence of image vectors comprises retrieving the expression dependent weights for said selected expression, wherein the parameters are provided in clusters, and each cluster comprises at least one sub-cluster, wherein said expression dependent weights are retrieved for each cluster such that there is one weight per sub-cluster; and
      outputting said sequence of text display indicators as video which is synchronised with the lip movement of the head.
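
Two steps of the claim lend themselves to a small illustration: deriving the timing of each displayed section of text from the per-unit durations, and forming an expression-specific parameter as a weighted sum with one weight per sub-cluster. The following is only a minimal sketch under stated assumptions, not the patented implementation: the names `AcousticUnit`, `text_display_schedule` and `expression_mean` are hypothetical, durations are assumed to be already produced by some duration model, a "section of text" is assumed to be a single word, and the parameter being combined is assumed to be a mean vector.

```python
from dataclasses import dataclass


@dataclass
class AcousticUnit:
    """Hypothetical container for one acoustic unit (e.g. a phone)."""
    label: str        # symbol of the unit
    word_index: int   # index of the word the unit belongs to
    duration: float   # seconds, assumed to come from a duration model


def text_display_schedule(units, words):
    """Return (word, start, end) for each word.

    The start and end times are obtained by accumulating the durations
    of the acoustic units, so the text display stays aligned with the
    speech the head is miming.
    """
    spans = []  # list of (word_index, start, end)
    t = 0.0
    for unit in units:
        start, end = t, t + unit.duration
        if spans and spans[-1][0] == unit.word_index:
            # Same word as the previous unit: extend its display span.
            idx, first_start, _ = spans[-1]
            spans[-1] = (idx, first_start, end)
        else:
            spans.append((unit.word_index, start, end))
        t = end
    return [(words[i], s, e) for i, s, e in spans]


def expression_mean(sub_cluster_means, weights):
    """Weighted sum of same-type parameters (here, mean vectors).

    One expression-dependent weight is applied per sub-cluster.
    """
    if len(sub_cluster_means) != len(weights):
        raise ValueError("need exactly one weight per sub-cluster")
    dim = len(sub_cluster_means[0])
    return [
        sum(w * mean[d] for w, mean in zip(weights, sub_cluster_means))
        for d in range(dim)
    ]


# Illustrative values only; they do not come from the patent.
words = ["hello", "world"]
units = [
    AcousticUnit("h", 0, 0.25),
    AcousticUnit("ow", 0, 0.25),
    AcousticUnit("w", 1, 0.5),
]
schedule = text_display_schedule(units, words)
blended = expression_mean([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5])
```

With these illustrative inputs, `schedule` holds one display span per word and `blended` is the mean vector for the chosen expression weights; a different expression would only change the weights, not the stored sub-cluster parameters.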