Video compression with Cool-chic

Cool-chic encodes the frames of a video one at a time through the coolchic/encode.py script. Each frame has either no reference (intra frame, coded as a plain image), one reference (P-frame) or two references (B-frame).

Coding structure

Cool-chic supports all the usual coding configurations (hierarchical random access, low-delay P). The coding structure is selected by specifying:

  • --n_frames: The number of frames to code

  • --intra_pos: Where to place the intra frames

  • --p_pos: Where to place the P frames

All the other frames are hierarchical B-frames with equidistant (past and future) references. Below are a few examples. See the coding structure documentation for more details.

# A low-delay P configuration
# I0
# \------> P1
#             \-------> P2
#                         \------> P3
#                                     \-------> P4
--n_frames=5 --intra_pos=0 --p_pos=1-4

# A hierarchical Random Access configuration, with a closed GOP
# I0
# \-------------------------------------------------------------------------------------> P8
# \----------------------------------------> B4 <----------------------------------------/
# \-----------------> B2 <------------------/  \------------------> B6 <-----------------/
# \------> B1 <------/  \-------> B3 <------/  \------> B5 <-------/  \------> B7 <------/
--n_frames=8 --intra_pos=0 --p_pos=-1

# A hierarchical Random Access configuration, with an open GOP
# I0                                                                                      I8
# \----------------------------------------> B4 <----------------------------------------/
# \-----------------> B2 <------------------/  \------------------> B6 <-----------------/
# \------> B1 <------/  \-------> B3 <------/  \------> B5 <-------/  \------> B7 <------/
--n_frames=8 --intra_pos=0,-1

# Or some very peculiar structures...
# I0
#   \---------------------------------------------------------------> P6
#   \-----------------------------> B3 <-----------------------------/  \-----------------> P8
#   \------> B1 <------------------/  \------> B4 <------------------/  \------> B7 <------/
#              \------> B2 <-------/             \------> B5 <-------/
--n_frames=8 --intra_pos=0 --p_pos=6,8
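The reference pattern drawn in the diagrams above can be reproduced with a short Python sketch. This is not Cool-chic's implementation: the function name `frame_references` is hypothetical, positions are passed as explicit non-negative display indices (so the sketch does not depend on how negative positions are resolved), and the rule "a B-frame sits at the floor midpoint of its two references" is only inferred from the diagrams.

```python
from typing import Dict, List, Tuple


def frame_references(
    intra_pos: List[int], p_pos: List[int]
) -> Dict[int, Tuple[str, List[int]]]:
    """Map each display index to (frame type, reference display indices).

    Hypothetical helper: assumes the first frame is an intra frame and the
    last frame is an I or P frame.
    """
    anchors = sorted(set(intra_pos) | set(p_pos))
    if not anchors or anchors[0] not in intra_pos:
        raise ValueError("the first coded frame must be an intra frame")

    frames: Dict[int, Tuple[str, List[int]]] = {}
    for i, pos in enumerate(anchors):
        if pos in intra_pos:
            frames[pos] = ("I", [])
        else:
            # Assumption: a P-frame refers to the closest previous I or P frame.
            frames[pos] = ("P", [anchors[i - 1]])

    def fill(left: int, right: int) -> None:
        # Recursively place a B-frame in the middle of the (left, right) gap.
        if right - left < 2:
            return
        mid = (left + right) // 2
        frames[mid] = ("B", [left, right])
        fill(left, mid)
        fill(mid, right)

    for left, right in zip(anchors, anchors[1:]):
        fill(left, right)
    return dict(sorted(frames.items()))


# The "very peculiar" structure: I0, P6, P8 and B-frames everywhere else.
for idx, (kind, refs) in frame_references([0], [6, 8]).items():
    print(f"{kind}{idx} references {refs}")
```

With `intra_pos=[0]` and `p_pos=[6, 8]`, the sketch gives B3 the references 0 and 6, B1 the references 0 and 3, and B2 the references 1 and 3, as in the last diagram.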

Encoding a video frame

To encode the N frames of a video, call the coolchic/encode.py script N times, once per frame. Keep the coding structure parameters identical across calls and vary only --coding_idx, which indicates the frame currently being encoded. Cool-chic handles all the reference management logic, based on the coding index and the coding structure parameters.

# In the hierarchical Random Access configuration with an open GOP listed above,
# encode the 3rd frame in **coding** order (--coding_idx=2): the B4 frame
(venv) ~/Cool-Chic$ python coolchic/encode.py       \
    --input=path_to_my_example                      \
    --output=bitstream.bin                          \
    --workdir=./my_temporary_workdir/               \
    --enc_cfg=cfg/enc/inter/tunable.cfg             \
    --dec_cfg_residue=cfg/dec/intra_residue/mop.cfg \
    --dec_cfg_motion=cfg/dec/motion/lop.cfg         \
    --n_frames=8                                    \
    --intra_pos=0,-1                                \
    --coding_idx=2                                  \
    --n_itr=5000                                    \
    --lmbda=0.01 # Typical range is 1e-2 (low rate) to 1e-4 (high rate)
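Repeating the call above for every coding index can be scripted. The sketch below only builds and prints the N command lines; it does not run them. The helper name `build_encode_commands` is hypothetical, and the flag list is abridged: the remaining flags of the call above (--output, --enc_cfg, --dec_cfg_residue, --dec_cfg_motion, --n_itr, --lmbda) would be added to `common_args` unchanged.

```python
import shlex
from typing import List


def build_encode_commands(common_args: List[str], n_frames: int) -> List[List[str]]:
    """Return one encode.py command per coding index (hypothetical helper).

    Every command shares the same arguments; only --coding_idx changes.
    """
    return [
        ["python", "coolchic/encode.py", *common_args, f"--coding_idx={idx}"]
        for idx in range(n_frames)
    ]


n_frames = 8
common_args = [
    "--input=path_to_my_example",
    "--workdir=./my_temporary_workdir/",
    f"--n_frames={n_frames}",
    "--intra_pos=0,-1",
    # Abridged: append the other flags of the example call here.
]

for cmd in build_encode_commands(common_args, n_frames):
    print(shlex.join(cmd))
```

Each printed line can be executed in order, for instance with `subprocess.run(cmd, check=True)`, so that the frames are encoded one after the other in coding order.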

File naming

All the outputs of the Cool-chic encoder (logs, compressed frames, trained models, …) in the --workdir are prefixed by the frame display index, e.g. 0016-frame_encoder.pt.
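A small Python sketch of this naming scheme can help when locating the files of a given frame. The helper `frame_output_path` is hypothetical, and the four-digit zero padding is inferred from the single 0016-frame_encoder.pt example rather than from a documented rule.

```python
from pathlib import Path


def frame_output_path(
    workdir: str, display_idx: int, suffix: str = "frame_encoder.pt"
) -> Path:
    """Build the expected path of one encoder output (hypothetical helper)."""
    # Assumption: the display index is zero-padded to four digits.
    return Path(workdir) / f"{display_idx:04d}-{suffix}"


print(frame_output_path("./my_temporary_workdir/", 16))
```

Note that the prefix is the display index, not the coding index: in the open GOP example, the frame encoded with --coding_idx=2 is B4, so under this assumption its files would start with 0004-.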