Colorization as a Proxy Task for Visual Understanding

We investigate and improve self-supervision as a drop-in replacement for ImageNet pretraining, focusing on automatic colorization as the proxy task. Self-supervised training has proven more promising for exploiting unlabeled data than traditional unsupervised learning methods. We demonstrate the effectiveness of our self-supervised network in several contexts. On VOC segmentation and classification tasks, we present results that are state-of-the-art among methods not using ImageNet labels for pretraining representations.

Self-supervision is a family of alternative pretraining methods that require no labeled data, since labels are "manufactured" from the unlabeled data itself.
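To make the idea of manufactured labels concrete, the following is a minimal sketch (not the paper's actual pipeline, which involves a deep network and a color-space-specific loss) of how colorization turns every unlabeled RGB image into a supervised (input, target) pair: the grayscale image serves as the input and the original colors serve as the label, at zero annotation cost. The function name and the use of plain RGB targets are illustrative choices, not details from the paper.

```python
import numpy as np

def make_colorization_pair(rgb):
    """Build a self-supervised training pair from one unlabeled RGB image.

    Input:  the grayscale (luminance) image.
    Label:  the original color image, "manufactured" from the data itself.
    """
    rgb = rgb.astype(np.float32) / 255.0
    # ITU-R BT.601 luma coefficients give the grayscale input channel
    gray = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    target = rgb  # the network would be trained to predict color from gray
    return gray, target

# Any unlabeled image yields a training pair for free
image = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
x, y = make_colorization_pair(image)
```

A real system would predict per-pixel color distributions rather than raw RGB values, but the key property is visible here: supervision comes from the image, not from human annotation.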