Frame Encoder

class FrameEncoder

A FrameEncoder is the object containing everything required to encode a video frame or an image. It is composed of a CoolChicEncoder and an InterCodingModule.

__init__(
coolchic_encoder_param: CoolChicEncoderParameter,
frame_type: Literal['I', 'P', 'B'] = 'I',
frame_data_type: Literal['rgb', 'yuv420', 'yuv444'] = 'rgb',
bitdepth: Literal[8, 10] = 8,
)
Parameters:
  • coolchic_encoder_param (CoolChicEncoderParameter) – Parameters for the underlying CoolChicEncoder

  • frame_type (Literal['I', 'P', 'B']) – More info in coding_structure.py. Defaults to “I”.

  • frame_data_type (Literal['rgb', 'yuv420', 'yuv444']) – More info in coding_structure.py. Defaults to “rgb”

  • bitdepth (Literal[8, 10]) – More info in coding_structure.py. Defaults to 8.

forward(
reference_frames: List[Tensor] | None = None,
quantizer_noise_type: Literal['kumaraswamy', 'gaussian', 'none'] = 'kumaraswamy',
quantizer_type: Literal['softround_alone', 'softround', 'hardround', 'ste', 'none'] = 'softround',
soft_round_temperature: float | None = 0.3,
noise_parameter: float | None = 1.0,
AC_MAX_VAL: int = -1,
flag_additional_outputs: bool = False,
) → FrameEncoderOutput

Perform the entire forward pass of a video frame / image.

  1. Simulate Cool-chic decoding to obtain both the decoded image \(\hat{\mathbf{x}}\) as a \((B, 3, H, W)\) tensor and its associated rate \(\mathrm{R}(\hat{\mathbf{x}})\) as an \((N)\) tensor, where \(N\) is the number of latent pixels. The rate is given in bits.

  2. Simulate the saving of the image to a file (Optional).

    Only if the model has been set to test mode, e.g. with self.set_to_eval(). This accounts for the fact that \(\hat{\mathbf{x}}\) is a float Tensor that will be saved as integer values in a file.

    \[\hat{\mathbf{x}}_{saved} = \mathtt{round}(\Delta_q \ \hat{\mathbf{x}}) / \Delta_q, \text{ with } \Delta_q = 2^{bitdepth} - 1\]
  3. Downscale to YUV420 (Optional). Only if the required output format is YUV420. The current output is a dense Tensor, so the last two channels are downscaled with nearest neighbor downsampling to obtain a YUV420-like representation.

  4. Clamp the output to be in \([0, 1]\).
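Steps 2 to 4 can be sketched in plain Python on nested lists. This is an illustrative sketch only, not the library's implementation: the helper names (simulate_save, downsample_nearest, clamp01) and the sample values are hypothetical, and keeping every second sample is one assumed convention for nearest neighbor downsampling by a factor of 2.

```python
from typing import List

Plane = List[List[float]]  # one channel, stored as rows of float samples


def simulate_save(x: float, bitdepth: int = 8) -> float:
    """Step 2: round x onto the integer grid used when writing to a file."""
    delta_q = 2 ** bitdepth - 1
    return round(delta_q * x) / delta_q


def downsample_nearest(plane: Plane) -> Plane:
    """Step 3: halve both dimensions by keeping every second sample (assumed convention)."""
    return [row[::2] for row in plane[::2]]


def clamp01(x: float) -> float:
    """Step 4: clamp a sample to [0, 1]."""
    return min(max(x, 0.0), 1.0)


# Hypothetical 2x4 chroma plane with a few values outside [0, 1].
u_plane = [[-0.1, 0.2, 0.501, 0.9], [0.3, 0.4, 1.2, 0.7]]

u_saved = [[simulate_save(v) for v in row] for row in u_plane]
u_420 = downsample_nearest(u_saved)
u_out = [[clamp01(v) for v in row] for row in u_420]
```

In the actual forward pass these operations apply to tensors, and the downsampling only concerns the last two channels.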

Parameters:
  • reference_frames (List[Tensor] | None) – List of tensors representing the reference frames. Can be set to None if no reference frame is available. Defaults to None.

  • quantizer_noise_type (Literal['kumaraswamy', 'gaussian', 'none']) – Defaults to "kumaraswamy".

  • quantizer_type (Literal['softround_alone', 'softround', 'hardround', 'ste', 'none']) – Defaults to "softround".

  • soft_round_temperature (float | None) – Soft round temperature. This is used for the softround modes as well as the ste mode, where it simulates the derivative in the backward pass. Defaults to 0.3.

  • noise_parameter (float | None) – Noise distribution parameter. Defaults to 1.0.

  • AC_MAX_VAL (int) – If different from -1, clamp the value to be in \([-AC\_MAX\_VAL; AC\_MAX\_VAL + 1]\) to write the actual bitstream. Defaults to -1.

  • flag_additional_outputs (bool) – True to fill CoolChicEncoderOutput['additional_data'] with many different quantities which can be used to analyze Cool-chic behavior. Defaults to False.
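The effect of AC_MAX_VAL described above can be illustrated with a minimal sketch. The function name clamp_for_bitstream and the sample values are hypothetical, and the sketch works on plain integers rather than tensors. It only mirrors the documented rule: no clamping when the value is -1, otherwise clamping to \([-AC\_MAX\_VAL; AC\_MAX\_VAL + 1]\).

```python
from typing import List


def clamp_for_bitstream(value: int, ac_max_val: int = -1) -> int:
    """Clamp value to [-ac_max_val, ac_max_val + 1], unless ac_max_val is -1."""
    if ac_max_val == -1:
        return value  # -1 disables the clamping
    return min(max(value, -ac_max_val), ac_max_val + 1)


# Hypothetical quantized values, two of them out of range.
latents: List[int] = [-300, -5, 0, 7, 300]

unclamped = [clamp_for_bitstream(v) for v in latents]
clamped = [clamp_for_bitstream(v, ac_max_val=255) for v in latents]
```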

Returns:

Output of the FrameEncoder for the forward pass.

Return type:

FrameEncoderOutput

get_param() → OrderedDict[str, Tensor]

Return a copy of the weights and biases inside the module.

Returns:

A copy of all weights & biases in the module.

Return type:

OrderedDict[str, Tensor]

set_param(param: OrderedDict[str, Tensor])

Replace the current parameters of the module with param.

Parameters:

param (OrderedDict[str, Tensor]) – Parameters to be set.
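Because get_param() returns a copy, it can serve as a snapshot that set_param() restores later. The sketch below illustrates this pattern with a hypothetical stand-in class, TinyModule, in which lists of floats play the role of tensors. It is not the FrameEncoder implementation, and whether the real set_param() copies its argument is an assumption.

```python
from collections import OrderedDict
from copy import deepcopy
from typing import List

ParamDict = "OrderedDict[str, List[float]]"


class TinyModule:
    """Hypothetical stand-in for a module holding weights and biases."""

    def __init__(self) -> None:
        self._param: ParamDict = OrderedDict(weight=[0.5, -0.25], bias=[0.0])

    def get_param(self) -> ParamDict:
        # Return a deep copy so the caller's snapshot stays independent.
        return deepcopy(self._param)

    def set_param(self, param: ParamDict) -> None:
        # Replace the current parameters with a copy of param (assumed behavior).
        self._param = deepcopy(param)


module = TinyModule()
best = module.get_param()           # take a snapshot
module._param["weight"][0] = 99.0   # simulate an update that degrades the weights
module.set_param(best)              # roll back to the snapshot
restored = module.get_param()
```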

reinitialize_parameters() → None

Reinitialize in place the different parameters of a FrameEncoder.

Return type:

None

set_to_train() → None

Set the current model to training mode, in place. This only affects the quantization.

Return type:

None

set_to_eval() → None

Set the current model to test mode, in place. This only affects the quantization.

Return type:

None

to_device(device: Literal['cpu', 'cuda:0']) → None

Push a model to a given device.

Parameters:

device (Literal['cpu', 'cuda:0']) – The device on which the model should run.

Return type:

None

save() → BytesIO

Save the FrameEncoder into a bytes buffer and return it.

Returns:

Bytes representing the saved coolchic model

Return type:

BytesIO

class FrameEncoderOutput

Dataclass representing the output of FrameEncoder forward.

load_frame_encoder(raw_bytes: BytesIO) → FrameEncoder

From already loaded raw bytes, load and return a FrameEncoder.

Parameters:

raw_bytes (BytesIO) – Already loaded raw bytes from which the FrameEncoder is instantiated.

Returns:

Frame encoder loaded by the function

Return type:

FrameEncoder
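Since save() returns a BytesIO and load_frame_encoder() expects one, storing a model on disk takes one extra step in each direction. The sketch below shows that step with the standard library only. The helpers write_buffer and read_buffer, the file name, and the placeholder payload are all hypothetical; in practice the buffer would come from save() and the reloaded buffer would be passed to load_frame_encoder().

```python
import tempfile
from io import BytesIO
from pathlib import Path


def write_buffer(buffer: BytesIO, path: Path) -> None:
    """Write the full content of an in-memory buffer to a file."""
    path.write_bytes(buffer.getvalue())


def read_buffer(path: Path) -> BytesIO:
    """Read a file back into an in-memory buffer positioned at its start."""
    return BytesIO(path.read_bytes())


# Placeholder payload standing in for the bytes returned by save().
saved = BytesIO(b"placeholder frame encoder bytes")

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = Path(tmp_dir) / "frame_encoder.bin"
    write_buffer(saved, file_path)
    raw_bytes = read_buffer(file_path)
```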