SOTA ImageNet 2022-01-14


This question is part of the Hill Climbing Round of the Forecasting AI Progress Tournament. You can view all other questions in this round here.

Image Classification is the task of identifying an image by assigning to it a specific label. Typically, Image Classification refers to images in which only one object appears and is analysed. In contrast, object detection involves both classification and localisation tasks, and is used to analyse more realistic cases in which multiple objects may exist in an image.

ImageNet (Deng et al., 2009) is a large-scale image dataset built upon the backbone of the WordNet structure. It is one of the largest visual recognition datasets and contains high-resolution images. It has more than 14 million annotated images organized by the semantic hierarchy of WordNet.

As of writing this question, the state-of-the-art model for this task is EfficientNet-B8 (Wei et al., 2020), which achieves a top-1 accuracy of 85.8% on ImageNet.
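For readers unfamiliar with the metric, top-1 accuracy is the fraction of images for which the model's single highest-scoring class matches the true label. The following is a minimal sketch in plain Python; the function name `top1_accuracy` and the toy scores are illustrative assumptions and are not taken from any ImageNet evaluation code.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the fraction of examples whose highest-scoring class is the true label.

    scores[i][c] is the model's score for class c on example i (hypothetical layout).
    labels[i] is the true class index of example i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("need at least one example")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score; ties go to the lowest class index.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Toy example with three classes and four images (made-up numbers).
toy_scores = [
    [0.1, 0.7, 0.2],  # predicted class 1
    [0.6, 0.3, 0.1],  # predicted class 0
    [0.2, 0.2, 0.6],  # predicted class 2
    [0.5, 0.4, 0.1],  # predicted class 0
]
toy_labels = [1, 0, 2, 1]
toy_result = top1_accuracy(toy_scores, toy_labels)  # 3 of 4 predictions are correct
```

A figure such as 85.8% corresponds to this fraction multiplied by 100.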

An excellent reference for tracking state-of-the-art models is PapersWithCode, which tracks performance data of ML models.

What will the state-of-the-art top-1 accuracy for image classification on ImageNet be on 2022-01-14, amongst models not trained on additional data?

This question resolves as the highest level of performance, in top-1 accuracy, achieved on ImageNet up until 2022-01-14, 11:59PM GMT, amongst models trained on only ImageNet's validation set (ImageNet does not clearly demarcate its validation and training sets). No extra training data may be used besides the original ImageNet dataset.

For the purpose of this question, augmented versions of the ImageNet dataset, such as ImageNet-V2 (Recht, 2019), are considered different from the original dataset of Deng et al. (2009).

Performance figures may be taken from e-prints, conference papers, peer-reviewed articles, and blog articles by reputable AI labs (including the associated code repositories). Published performance figures must be available before 2022-01-14, 11:59PM GMT to qualify.

In case the relevant performance figure is given as a confidence interval, the median value will be used to resolve the question.
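The resolution rules above can be sketched as a small filter-and-maximise procedure. This is only an illustration under stated assumptions, not the tournament's actual resolution process: the `Report` record, its field names, and the helper functions are hypothetical; the deadline is treated as inclusive; and the "median value" of a confidence interval is read as its midpoint, which holds only for a symmetric interval.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional, Tuple

# Deadline stated in the question: 2022-01-14, 11:59PM GMT.
DEADLINE = datetime(2022, 1, 14, 23, 59, tzinfo=timezone.utc)


@dataclass
class Report:
    """Hypothetical record of one published ImageNet result."""
    model: str
    published: datetime            # timezone-aware publication time
    uses_extra_data: bool          # True if trained on data beyond ImageNet
    top1: Optional[float] = None   # point estimate in percent, if given
    top1_interval: Optional[Tuple[float, float]] = None  # (low, high) in percent


def point_value(report: Report) -> float:
    """Return the single figure used for resolution."""
    if report.top1 is not None:
        return report.top1
    if report.top1_interval is not None:
        low, high = report.top1_interval
        # Assumption: the interval's median is taken to be its midpoint.
        return (low + high) / 2
    raise ValueError(f"no top-1 figure reported for {report.model}")


def resolve(reports: Iterable[Report]) -> Optional[float]:
    """Highest qualifying top-1 accuracy, or None if nothing qualifies."""
    qualifying = [
        point_value(r)
        for r in reports
        # Assumption: figures published exactly at the deadline still count.
        if not r.uses_extra_data and r.published <= DEADLINE
    ]
    return max(qualifying) if qualifying else None
```

Results that use extra training data, or that appear after the deadline, are dropped before the maximum is taken, so a higher but non-qualifying figure does not affect the outcome.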

