Generative Pretraining from Pixels

Inspired by progress in unsupervised representation learning for natural language, we examine whether similar models can learn useful representations for images. We train a sequence Transformer to auto-regressively predict pixels, without incorporating knowledge of the 2D input structure. Despite training on low-resolution ImageNet without labels, we find that a GPT-2 scale model learns strong image representations as measured by linear probing, fine-tuning, and low-data classification…
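The core idea is that an image, once flattened to a 1D token sequence, can be modeled exactly like text with a standard decoder-only Transformer. Below is a minimal sketch of that setup in PyTorch, assuming each pixel is an 8-bit intensity token; all names (`PixelGPT`, `autoregressive_loss`, the layer sizes) are illustrative and do not reflect the paper's actual code, which differs in scale and in how colors are tokenized.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelGPT(nn.Module):
    """Decoder-only Transformer over a flattened pixel sequence. The model
    sees only a 1D token stream, with no 2D inductive bias built in."""
    def __init__(self, vocab=256, d_model=256, n_layers=4, n_heads=8, seq_len=1024):
        super().__init__()
        self.tok = nn.Embedding(vocab, d_model)      # one token per pixel value
        self.pos = nn.Embedding(seq_len, d_model)    # learned 1D positions
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model,
                                           batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab)

    def forward(self, seq):                          # seq: (B, T) int64 tokens
        T = seq.size(1)
        x = self.tok(seq) + self.pos(torch.arange(T, device=seq.device))
        # Causal mask so position t attends only to positions < t.
        causal = nn.Transformer.generate_square_subsequent_mask(T).to(seq.device)
        return self.head(self.blocks(x, mask=causal))  # (B, T, vocab) logits

def autoregressive_loss(model, images):
    """images: (B, H, W) uint8. Flatten in raster order, then train the
    model to predict pixel t given pixels < t (targets shifted by one)."""
    seq = images.flatten(1).long()                   # (B, H*W)
    logits = model(seq[:, :-1])
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           seq[:, 1:].reshape(-1))

# Usage sketch on random data:
model = PixelGPT(seq_len=32 * 32)
imgs = torch.randint(0, 256, (2, 32, 32), dtype=torch.uint8)
loss = autoregressive_loss(model, imgs)
```

For the linear-probe evaluation the abstract mentions, the pretrained Transformer would be frozen and a single linear classifier fit on features pooled from an intermediate layer; how well that classifier does is taken as a measure of representation quality.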

Source: https://cdn.openai.com/papers/Generative_Pretraining_from_Pixels_V2.pdf
