Image compression with Cool-chic¶

Encoding your own image is achieved by using the script coolchic/encode.py.

(venv) ~/Cool-Chic$ python coolchic/encode.py       \
    --input=path_to_my_example                      \
    --output=bitstream.bin                          \
    --workdir=./my_temporary_workdir/               \
    --enc_cfg=cfg/enc/intra/fast_10k.cfg            \
    --dec_cfg_residue=cfg/dec/intra_residue/mop.cfg \
    --lmbda=0.001 # Typical range is 1e-2 (low rate) to 1e-4 (high rate)

Unlike the decoding script which only takes input and output arguments, the encoder has many arguments allowing to tune Cool-chic for your need.

Encoder configuration affects the encoding duration by changing the training parameters. This is set through the argument --enc_cfg. Several encoder configuration files are available in cfg/enc/.
Decoder configuration parametrizes the decoder architecture and complexity. This is set through the argument --dec_cfg. Several encoder configuration files are available in cfg/dec/.

Working directory¶

The --workdir argument is used to specify a folder where all necessary data will be stored. This includes the encoder logs and the PyTorch model (workdir/0000-frame_encoder.pt).

Attention

If present, the PyTorch model inside workdir workdir/0000-frame_encoder.pt is reloaded by the coolchic/encode.py script. In order to encode a new image using the same workdir, you must first clean out the workdir.

I/O format¶

Cool-chic is able to encode PPM, PNG, YUV420 & YUV 444 files. The naming of YUV files must comply with the following convention

--input=<videoname>_<Width>x<Height>_<framerate>p_yuv<chromasampling>_<bitdepth>b.yuv

Note that Cool-Chic outputs either PPM (and not PNG!) or YUV files.

Rate constraint¶

The rate constraint --lmbda is used to balance the rate and the distortion when encoding an image. Indeed, Cool-chic parameters are optimized through gradient descent according to the following rate-distortion objective:

\[\begin{split}\mathcal{L} = ||\mathbf{x} - \hat{\mathbf{x}}||^2 + \lambda (\mathrm{R}(\hat{\mathbf{x}})), \text{ with } \begin{cases} \mathbf{x} & \text{the original image}\\ \hat{\mathbf{x}} & \text{the coded image}\\ \mathrm{R}(\hat{\mathbf{x}}) & \text{a measure of the rate of } \hat{\mathbf{x}}\\ \lambda & \text{the rate constraint }\texttt{--lmbda} \end{cases}\end{split}\]