    • 6. Invention application
    • MULTILINGUAL IMAGE QUESTION ANSWERING
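    • Publication number: US20160342895A1
    • Publication date: 2016-11-24
    • Application number: US15137179
    • Filing date: 2016-04-25
    • Applicant: Baidu USA, LLC
    • Inventors: Haoyuan Gao, Junhua Mao, Jie Zhou, Zhiheng Huang, Lei Wang, Wei Xu
    • IPC classification: G06N5/02; G06F17/27
    • CPC classification: G06F17/2881; G06N3/0445; G06N3/0454; G06N5/04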
    • Embodiments of a multimodal question answering (mQA) system are presented to answer a question about the content of an image. In embodiments, the model comprises four components: a Long Short-Term Memory (LSTM) component to extract the question representation; a Convolutional Neural Network (CNN) component to extract the visual representation; an LSTM component for storing the linguistic context in an answer, and a fusing component to combine the information from the first three components and generate the answer. A Freestyle Multilingual Image Question Answering (FM-IQA) dataset was constructed to train and evaluate embodiments of the mQA model. The quality of the generated answers of the mQA model on this dataset is evaluated by human judges through a Turing Test.
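
The four-component data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the patented implementation. The class and function names (LSTMCell, conv_features, MQASketch), the layer sizes, the single-layer toy CNN, the fusion rule (sum of linear projections, tanh, then softmax) and the greedy decoding loop are all assumptions made for the example. Weights are random and untrained, so the generated words are meaningless; only the wiring between components is the point.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class LSTMCell:
    """Standard LSTM cell (biases omitted for brevity)."""
    def __init__(self, in_dim, hid_dim, rng):
        # One weight matrix per gate, applied to the concatenation [x; h_prev].
        self.W = {g: rand_matrix(hid_dim, in_dim + hid_dim, rng) for g in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation
        i = [sigmoid(a) for a in matvec(self.W["i"], z)]    # input gate
        f = [sigmoid(a) for a in matvec(self.W["f"], z)]    # forget gate
        o = [sigmoid(a) for a in matvec(self.W["o"], z)]    # output gate
        g = [math.tanh(a) for a in matvec(self.W["g"], z)]  # candidate memory
        c = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
        h = [oi * math.tanh(ci) for oi, ci in zip(o, c)]
        return h, c

def conv_features(image, kernels):
    """Toy CNN: one 3x3 convolution layer, ReLU, global average pooling."""
    rows, cols = len(image) - 2, len(image[0]) - 2
    feats = []
    for k in kernels:
        total = 0.0
        for r in range(rows):
            for c in range(cols):
                s = sum(k[dr][dc] * image[r + dr][c + dc]
                        for dr in range(3) for dc in range(3))
                total += max(0.0, s)
        feats.append(total / (rows * cols))
    return feats

class MQASketch:
    """Hypothetical wiring of the four components named in the abstract."""
    def __init__(self, vocab, emb_dim=8, hid_dim=8, n_kernels=4, seed=0):
        rng = random.Random(seed)
        self.vocab = vocab
        self.index = {w: i for i, w in enumerate(vocab)}
        self.hid_dim = hid_dim
        self.emb = rand_matrix(len(vocab), emb_dim, rng)
        self.q_lstm = LSTMCell(emb_dim, hid_dim, rng)  # 1: question LSTM
        self.kernels = [rand_matrix(3, 3, rng) for _ in range(n_kernels)]  # 2: CNN
        self.a_lstm = LSTMCell(emb_dim, hid_dim, rng)  # 3: answer-context LSTM
        # 4: fusing component (assumed form: projections summed, then softmax)
        self.P_q = rand_matrix(hid_dim, hid_dim, rng)
        self.P_i = rand_matrix(hid_dim, n_kernels, rng)
        self.P_a = rand_matrix(hid_dim, hid_dim, rng)
        self.W_out = rand_matrix(len(vocab), hid_dim, rng)

    def encode_question(self, words):
        h = c = [0.0] * self.hid_dim
        for w in words:
            h, c = self.q_lstm.step(self.emb[self.index[w]], h, c)
        return h

    def fuse(self, q, img, a):
        """Return a probability distribution over the vocabulary."""
        s = [math.tanh(x + y + z) for x, y, z in
             zip(matvec(self.P_q, q), matvec(self.P_i, img), matvec(self.P_a, a))]
        logits = matvec(self.W_out, s)
        m = max(logits)
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def answer(self, image, question, max_len=5):
        q = self.encode_question(question)
        img = conv_features(image, self.kernels)
        h = c = [0.0] * self.hid_dim
        word, out = "<bos>", []
        for _ in range(max_len):  # greedy decoding, one word per step
            h, c = self.a_lstm.step(self.emb[self.index[word]], h, c)
            probs = self.fuse(q, img, h)
            word = self.vocab[probs.index(max(probs))]
            if word == "<eos>":
                break
            out.append(word)
        return out

vocab = ["<bos>", "<eos>", "what", "is", "this", "cat", "dog"]
image = [[((r * c) % 5) / 5.0 for c in range(6)] for r in range(6)]
model = MQASketch(vocab)
print(model.answer(image, ["what", "is", "this"]))
```

The question and image representations are computed once, while the answer-context LSTM and the fusing step run once per generated word, so each new word is conditioned on the question, the image, and the words already produced. In a trained system the toy convolution would presumably be replaced by a deep pretrained CNN and the weights learned on a dataset such as FM-IQA.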