Abstract: Deep image compression and text-to-image generation represent two distinct paradigms in visual representation learning: one focuses on coded representations, while the other emphasizes ...