Open-vocabulary Attribute Detection

Bravo, Maria A.; Mittal, Sudhanshu; Ging, Simon; Brox, Thomas

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2023

Abstract: Vision-language modeling has enabled open-vocabulary tasks where predictions can be queried using any text prompt in a zero-shot manner. Existing open-vocabulary tasks focus on object classes, whereas research on object attributes is limited due to the lack of a reliable attribute-focused evaluation benchmark. This paper introduces the Open-Vocabulary Attribute Detection (OVAD) task and the corresponding OVAD benchmark. The objective of the novel task and benchmark is to probe object-level attribute information learned by vision-language models. To this end, we created a clean and densely annotated test set covering 117 attribute classes on the 80 object classes of MS COCO. It includes positive and negative annotations, which enables open-vocabulary evaluation. Overall, the benchmark consists of 1.4 million annotations. For reference, we provide a first baseline method for open-vocabulary attribute detection. Moreover, we demonstrate the benchmark's value by studying the attribute detection performance of several foundation models.

BibTeX reference

@InProceedings{BMGB23,
  author       = "M. Bravo and S. Mittal and S. Ging and T. Brox",
  title        = "Open-vocabulary Attribute Detection",
  booktitle    = "IEEE Conference on Computer Vision and Pattern Recognition (CVPR)",
  month        = "Jun",
  year         = "2023",
  url          = "http://lmbweb.informatik.uni-freiburg.de/Publications/2023/BMGB23"
}
