Visual Question Answering Using Semantic Information from Image Descriptions

Authors

  • Tasmia Tasmia, University of Kentucky
  • Md Sultan Al Nahian, University of Kentucky
  • Brent Harrison, University of Kentucky

DOI:

https://doi.org/10.32473/flairs.v34i1.128460

Keywords:

Deep Learning, Natural Language Processing, Computer Vision

Abstract

In this work, we propose a deep neural architecture whose attention mechanism combines region-based image features, the natural language question, and semantic knowledge extracted from the regions of an image to produce open-ended answers in a visual question answering (VQA) task. Combining region-based visual features with region-based textual information about the image helps a model answer questions more accurately, potentially with less training data. We evaluate the proposed architecture against a strong baseline on a VQA task and show that our method achieves excellent results.
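To make the described fusion concrete, the sketch below shows, in PyTorch, one plausible way to wire together the three signals the abstract names: region features, the question, and per-region semantic descriptions. The module names, layer sizes, GRU encoders, and the element-wise fusion of visual and caption features are illustrative assumptions, not the paper's actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RegionSemanticVQA(nn.Module):
    """Hypothetical fusion model: attention over image regions,
    conditioned on the question, using both visual features and
    encoded region descriptions (semantic knowledge)."""

    def __init__(self, vocab_size, num_answers, embed_dim=300,
                 hidden_dim=512, region_dim=2048):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Encodes the natural language question into a single vector.
        self.q_rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # Encodes the semantic description (caption) of each region.
        self.c_rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # Projects visual region features into the shared hidden space.
        self.v_proj = nn.Linear(region_dim, hidden_dim)
        # Scores each region against the question (attention logits).
        self.att = nn.Linear(hidden_dim, 1)
        self.classifier = nn.Linear(hidden_dim * 2, num_answers)

    def forward(self, question, regions, captions):
        # question: (B, Tq) token ids
        # regions:  (B, R, region_dim) pre-extracted region features
        # captions: (B, R, Tc) token ids describing each region
        B, R, Tc = captions.shape
        _, q = self.q_rnn(self.embed(question))              # (1, B, H)
        q = q.squeeze(0)                                     # (B, H)

        _, c = self.c_rnn(self.embed(captions.view(B * R, Tc)))
        c = c.squeeze(0).view(B, R, -1)                      # (B, R, H)

        v = torch.tanh(self.v_proj(regions))                 # (B, R, H)
        # Fuse each region's visual and semantic views, then attend
        # over regions conditioned on the question vector.
        fused = v * c                                        # (B, R, H)
        logits = self.att(torch.tanh(fused * q.unsqueeze(1)))  # (B, R, 1)
        alpha = F.softmax(logits, dim=1)                     # region weights
        ctx = (alpha * fused).sum(dim=1)                     # (B, H)

        # Classify over a fixed answer vocabulary (open-ended answers
        # are commonly framed this way in VQA).
        return self.classifier(torch.cat([ctx, q], dim=-1))  # (B, num_answers)

For example, with a batch of 2 questions of length 10, 36 regions per image, and 5-token region captions:

model = RegionSemanticVQA(vocab_size=10000, num_answers=3000)
q = torch.randint(0, 10000, (2, 10))
v = torch.randn(2, 36, 2048)
c = torch.randint(0, 10000, (2, 36, 5))
scores = model(q, v, c)  # (2, 3000) answer logits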

Published

2021-04-18

How to Cite

Tasmia, T., Nahian, M. S. A., & Harrison, B. (2021). Visual Question Answering Using Semantic Information from Image Descriptions. The International FLAIRS Conference Proceedings, 34. https://doi.org/10.32473/flairs.v34i1.128460

Issue

The International FLAIRS Conference Proceedings, Vol. 34 (2021)

Section

Main Track Proceedings