Entries for the latest VQA v2 challenge close on Monday morning, and we’re currently ranked first among the entries submitted so far. More submissions will arrive as the deadline approaches, but we’ll keep improving our performance too. So we’re not there yet, but we’re in a very good position.
The task of Visual Question Answering (VQA) has received increasing interest from researchers in Computer Vision and Natural Language Processing. The field of computer vision has recently seen tremendous advances with the success of deep learning, in particular on low- and mid-level tasks such as image segmentation and object recognition. These advances have fueled researchers’ confidence in tackling more complex tasks that combine vision with language and high-level reasoning, and VQA is the prime example of this trend.
The approach we’ve submitted builds on our work incorporating additional sources of information, as well as other methods we developed to tackle Zero-Shot VQA.