Entries for the latest VQA v2 challenge close on Monday morning, and we’re currently ranked first among the entries submitted so far. More submissions will arrive as the deadline approaches, but we’ll keep improving our performance too. So we’re not there yet, but we’re in a very good position.
The task of Visual Question Answering (VQA) has received increasing interest from researchers in Computer Vision and Natural Language Processing. The field of computer vision has recently seen tremendous advances with the success of deep learning, in particular on low- and mid-level tasks such as image segmentation and object recognition. These advances have fueled researchers’ confidence in tackling more complex tasks that combine vision with language and high-level reasoning, and VQA is the prime example of this trend.
The approach we’ve submitted builds on our work incorporating additional sources of information, as well as other methods we developed to tackle Zero-Shot VQA.