1. Dual-Conditioned Training to Exploit Pre-Trained Codebook-Based Generative Model in Image Compression
- Author
Shoma Iwai, Tomo Miyazaki, and Shinichiro Omachi
- Subjects
Generative adversarial networks, image compression, VQGAN, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Learned image compression (LIC) is increasingly gaining attention. To improve the perceptual quality of reconstructions, generative LIC has been studied, using generative models such as Generative Adversarial Networks (GANs). State-of-the-art generative LIC methods have achieved remarkable performance even in low bit rate settings. Unlike most approaches trained from scratch, we propose a generative LIC that utilizes a pre-trained codebook-based generative model, Vector-Quantized GAN (VQGAN). Specifically, our model is designed to exploit its powerful image-generation capabilities to enhance compression performance. Our approach reconstructs an image from a transmitted bitstream in two steps: (1) estimating VQGAN tokens and feeding them into the pre-trained VQGAN decoder, and (2) modifying the decoder’s intermediate features to address artifacts and distortions. Our preliminary experiments reveal that the information allocation between (1) and (2) is pivotal for reconstruction quality. Moreover, we found that the ideal allocation varies based on the target bit rate. Motivated by these findings, we propose a novel Dual-Conditioned training. Through the training, the model learns to adjust the total bit rate and information allocation between (1) and (2) based on two conditional inputs. Subsequently, we explore the conditional inputs to achieve the optimal results for each target bit rate. This training strategy enables us to effectively exploit the generation capability of VQGAN across different bit rates. Our method, named Dual Conditioned VQGAN-based Image Compression (DC-VIC), outperforms state-of-the-art generative LIC methods in rate-distortion-perception performance. Code will be available at https://github.com/iwa-shi/DC_VIC
- Published
- 2024
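The two-step reconstruction described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the real method uses a pre-trained VQGAN decoder network, whereas the code below only assumes a small hand-written codebook, treats step (1) as a nearest-neighbor token lookup, and treats step (2) as adding a transmitted correction to the looked-up feature. The names `nearest_token`, `reconstruct`, `estimates`, and `corrections` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest_token(feature: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to `feature` (squared L2)."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(feature, codebook[i]))


def reconstruct(
    estimates: Sequence[Vector],
    corrections: Sequence[Vector],
    codebook: Sequence[Vector],
) -> Tuple[List[int], List[List[float]]]:
    """Toy two-step reconstruction (an assumption, not the paper's model).

    Step (1): map each estimated feature to a codebook token.
    Step (2): add a per-position correction to the looked-up codebook vector,
              standing in for the modification of decoder features.
    """
    tokens = [nearest_token(f, codebook) for f in estimates]
    features = [
        [c + d for c, d in zip(codebook[t], delta)]
        for t, delta in zip(tokens, corrections)
    ]
    return tokens, features


if __name__ == "__main__":
    codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
    estimates = [[0.9, 1.2], [3.5, 4.1]]
    corrections = [[0.5, -0.5], [0.0, 0.25]]
    print(reconstruct(estimates, corrections, codebook))
```

In this sketch the bits spent on `estimates` and on `corrections` play the role of the information allocation between steps (1) and (2); how the actual model balances them through its two conditional inputs is not specified in this record.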