* Equal contribution
Technion — Israel Institute of Technology
We present a novel generative approach based on Denoising Diffusion Models (DDMs), which produces high-quality image samples along with their losslessly compressed bit-stream representations. This is achieved by replacing the standard Gaussian noise sampling in the reverse diffusion process with a selection of noise samples from pre-defined codebooks of fixed iid Gaussian vectors. Surprisingly, we find that our method, termed Denoising Diffusion Codebook Model (DDCM), retains the sample quality and diversity of standard DDMs, even for extremely small codebooks. We leverage DDCM and pick the noises from the codebooks that best match a given image, converting our generative model into a highly effective lossy image codec that achieves state-of-the-art perceptual image compression results. More generally, by setting other noise selection rules, we extend our compression method to any conditional image generation task (e.g., image restoration), where the generated images are produced jointly with their condensed bit-stream representations. Our work is accompanied by a mathematical interpretation of the proposed compressed conditional generation schemes, establishing a connection with score-based approximations of posterior samplers for the tasks considered.
DDCM can easily be leveraged for perceptual image compression, using any pre-trained DDM.
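The idea of picking codebook noises that best match a given image can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: `toy_denoiser` is a hypothetical stand-in for a pre-trained DDM's clean-image estimate, the update rule is a simplified reverse step, the starting point is fixed to zeros, and "best match" is taken to mean the codebook entry with the largest inner product with the current residual.

```python
import numpy as np

def make_codebooks(num_steps, K, dim, seed=0):
    """One fixed codebook of K iid standard Gaussian vectors per step.
    Encoder and decoder rebuild identical codebooks from the shared seed."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((num_steps, K, dim))

def toy_denoiser(x_t, t):
    """Hypothetical placeholder for a pre-trained DDM's clean-image estimate."""
    return x_t

def encode(target, codebooks, sigmas, denoiser=toy_denoiser):
    """Greedily select one codebook index per step so the sample drifts
    toward `target`. Returns the index list (the bit-stream) and the sample."""
    num_steps, K, dim = codebooks.shape
    x = np.zeros(dim)  # assumed fixed starting point, known to the decoder
    indices = []
    for t in range(num_steps):
        x0_hat = denoiser(x, t)
        # Assumed matching rule: largest inner product with the residual.
        scores = codebooks[t] @ (target - x0_hat)
        k = int(np.argmax(scores))
        indices.append(k)
        x = x0_hat + sigmas[t] * codebooks[t, k]
    return indices, x

def decode(indices, codebooks, sigmas, denoiser=toy_denoiser):
    """Replay the stored indices; no access to the target image is needed."""
    x = np.zeros(codebooks.shape[2])
    for t, k in enumerate(indices):
        x = denoiser(x, t) + sigmas[t] * codebooks[t, k]
    return x

codebooks = make_codebooks(num_steps=50, K=16, dim=64)
sigmas = np.linspace(1.0, 0.05, 50)  # illustrative noise schedule
target = np.random.default_rng(1).standard_normal(64)
indices, recon = encode(target, codebooks, sigmas)
decoded = decode(indices, codebooks, sigmas)
```

Because the decoder replays the same deterministic steps with the same codebooks, `decoded` is identical to the encoder's `recon`; only the list of indices has to be stored or transmitted.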
DDCM can be utilized to solve a variety of compressed conditional image generation tasks, where the generated images are produced jointly with their compressed bit-stream representations.
We use DDCM to solve zero-shot image restoration tasks with noiseless linear degradations, while producing the compressed bit-stream representations of the restored images. We compare our results with the outputs of DPS and DDNM compressed using DDCM (for a fair comparison), as well as with their original uncompressed outputs.
*Comparisons available in full-screen display.
DDCM can also be used to solve real-world image restoration tasks, again, while automatically producing the compressed bit-stream representations of the restored images. We compare our results against the compressed outputs of several state-of-the-art algorithms: PMRF, DifFace, and BFRffusion. We also compare with the original uncompressed outputs of these methods.
We propose novel compressed classifier-based and classifier-free guidance methods (CG & CFG) based on DDCM, as well as a preliminary algorithm for compressed text-based image editing. For both guidance methods demonstrated below, we set K̃=2 and only alter K. Here, K̃ behaves like a guidance scale, and K controls the bitrate.
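To see why K controls the bitrate, note that identifying one entry in a codebook of size K takes about log2(K) bits. The arithmetic below is a rough sketch under stated assumptions: indices are stored with a fixed-length code (no entropy coding), and `num_indexed_steps`, the number of steps that contribute an index, is a hypothetical parameter whose value depends on the sampler.

```python
def bits_per_index(K):
    """Bits needed for a fixed-length code over K codebook entries."""
    # (K - 1).bit_length() equals ceil(log2(K)) for K >= 1, without floats.
    return (K - 1).bit_length()

def bits_per_image(K, num_indexed_steps):
    """Total bits when every indexed step stores one codebook index."""
    return num_indexed_steps * bits_per_index(K)

def bits_per_pixel(K, num_indexed_steps, height, width):
    """Bitrate in bits per pixel for an image of the given size."""
    return bits_per_image(K, num_indexed_steps) / (height * width)

# Illustrative numbers only: doubling K adds one bit per indexed step.
total_small = bits_per_image(K=64, num_indexed_steps=1000)
total_large = bits_per_image(K=128, num_indexed_steps=1000)
```

Under these assumptions the bitrate grows only logarithmically in K, so large changes in codebook size translate into modest changes in bits per pixel.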
Class: lorikeet
"Rainbow over the mountains."
"a sculpture of a cat" → "a graffiti of a cat"