Object detection is the task of detecting instances of objects of a certain class within an image. Microsoft's Common Objects in Context (COCO) is a dataset from 2014 that is used to benchmark object recognition. The data places the question of object recognition in the context of the broader question of scene understanding. It contains images of complex everyday scenes containing common objects in their natural context. COCO includes a total of 2.5 million labeled instances in 328k images.
As of writing this question, the state-of-the-art model on COCO is Cascade Eff-B7 NAS-FPN (Ghiasi et al., 2020), which achieves a box average precision (box AP) of 57.3.
An excellent reference for tracking state-of-the-art models is PapersWithCode, which tracks performance data of ML models.
What will the state-of-the-art object detection performance on COCO be on 2023-02-14, measured in box average precision (box AP), among all models?
This question resolves as the highest level of performance in box AP achieved on COCO test-dev (COCO's test set) up until 2023-02-14 11:59 PM GMT. Models trained on additional datasets do qualify. Moreover, models using test-time augmentations may also qualify.
Performance figures may be taken from e-prints, conference papers, peer-reviewed articles, and blog articles by reputable AI labs (including the associated code repositories). Published performance figures must be available before 2023-02-14, 11:59PM GMT to qualify.
In case the relevant performance figure is given as a confidence interval, the median value will be used to resolve the question.
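The resolution rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an official resolution procedure: the `Result` record, the `point_value` and `resolve` helpers, and the sample entries are hypothetical; the deadline is treated as 11:59 PM GMT, following the publication cutoff; and the "median value" of a confidence interval is assumed to be its midpoint, which holds only for a symmetric interval.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional, Tuple, Union

# Cutoff from the resolution criteria (assumed here to mean 11:59 PM GMT).
DEADLINE = datetime(2023, 2, 14, 23, 59, tzinfo=timezone.utc)


@dataclass
class Result:
    """Hypothetical record of one published COCO test-dev figure."""
    model: str
    published: datetime
    # Either a single box AP value or a (low, high) confidence interval.
    box_ap: Union[float, Tuple[float, float]]


def point_value(box_ap: Union[float, Tuple[float, float]]) -> float:
    """Reduce a reported figure to one number.

    Assumption: an interval is summarised by its midpoint.
    """
    if isinstance(box_ap, tuple):
        low, high = box_ap
        return (low + high) / 2
    return float(box_ap)


def resolve(results: List[Result]) -> Optional[float]:
    """Return the highest box AP published before the deadline, if any."""
    eligible = [r for r in results if r.published < DEADLINE]
    if not eligible:
        return None
    return max(point_value(r.box_ap) for r in eligible)


# Sample entries: the first figure is the baseline quoted in the question
# (placeholder date, since only the year is given); the rest are made up.
example = [
    Result("Cascade Eff-B7 NAS-FPN",
           datetime(2020, 12, 31, tzinfo=timezone.utc), 57.3),
    Result("hypothetical-model-a",
           datetime(2022, 6, 1, tzinfo=timezone.utc), (58.0, 59.0)),
    Result("hypothetical-model-b",  # published too late, so it is ignored
           datetime(2023, 3, 1, tzinfo=timezone.utc), 70.0),
]

if __name__ == "__main__":
    print(resolve(example))
```

In this sketch the late entry is excluded by the date filter and the interval entry contributes its midpoint, so the maximum is taken over the two eligible figures only.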