Frame Encoder¶

class FrameEncoder[source]¶

A FrameEncoder is the object containing everything required to encode a video frame or an image. It is composed of one or more CoolChicEncoder instances.

__init__(
coolchic_enc_param: Dict[Literal['residue', 'motion'], CoolChicEncoderParameter],
frame_type: Literal['I', 'P', 'B'] = 'I',
frame_data_type: Literal['rgb', 'yuv420', 'yuv444', 'flow'] = 'rgb',
bitdepth: Literal[8, 9, 10, 11, 12, 13, 14, 15, 16] = 8,
index_references: List[int] = [],
frame_display_index: int = 0,
)[source]¶
Parameters:
  • coolchic_enc_param (Dict[Literal['residue', 'motion'], ~enc.component.coolchic.CoolChicEncoderParameter]) – Parameters for the underlying CoolChicEncoders

  • frame_type (Literal['I', 'P', 'B']) – More info in coding_structure.py. Defaults to “I”.

  • frame_data_type (Literal['rgb', 'yuv420', 'yuv444', 'flow']) – More info in coding_structure.py. Defaults to “rgb”.

  • bitdepth (Literal[8, 9, 10, 11, 12, 13, 14, 15, 16]) – More info in coding_structure.py. Defaults to 8.

  • index_references (List[int]) – List of the display indices of the reference frames. Defaults to [].

  • frame_display_index (int) – Display index of the frame being encoded. Defaults to 0.

forward(
reference_frames: List[Tensor] | None = None,
quantizer_noise_type: Literal['kumaraswamy', 'gaussian', 'none'] = 'kumaraswamy',
quantizer_type: Literal['softround_alone', 'softround', 'hardround', 'ste', 'none'] = 'softround',
soft_round_temperature: Tensor | None = tensor(0.3000),
noise_parameter: Tensor | None = tensor(1.),
AC_MAX_VAL: int = -1,
flag_additional_outputs: bool = False,
) FrameEncoderOutput[source]¶

Perform the entire forward pass of a video frame / image.

  1. Simulate Cool-chic decoding to obtain both the decoded image \(\hat{\mathbf{x}}\) as a \((B, 3, H, W)\) tensor and its associated rate \(\mathrm{R}(\hat{\mathbf{x}})\) as an \((N)\) tensor, where \(N\) is the number of latent pixels. The rate is given in bits.

  2. Simulate the saving of the image to a file (Optional).

    Only performed when the model is in test mode, e.g. after calling self.set_to_eval(). Since \(\hat{\mathbf{x}}\) is a float tensor that will be stored as integer values in a file, it is rounded accordingly:

    \[\hat{\mathbf{x}}_{saved} = \mathtt{round}(\Delta_q \ \hat{\mathbf{x}}) / \Delta_q, \text{ with } \Delta_q = 2^{bitdepth} - 1\]
  3. Downscale to YUV420 (Optional). Only if the required output format is YUV420. At this point the output is a dense tensor; the last two channels (U and V) are downscaled with nearest-neighbor downsampling to obtain a YUV420-like representation.

  4. Clamp the output to be in \([0, 1]\).
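
Steps 2 to 4 above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: all helper names are hypothetical, planes are plain nested lists rather than tensors, and keeping the top-left sample of each 2x2 block is an assumed nearest-neighbor convention.

```python
from typing import List, Tuple

Plane = List[List[float]]  # one channel, indexed as plane[row][col]


def round_to_bitdepth(x: float, bitdepth: int = 8) -> float:
    """Step 2: round(delta_q * x) / delta_q with delta_q = 2^bitdepth - 1."""
    delta_q = 2 ** bitdepth - 1
    return round(delta_q * x) / delta_q


def downsample_nearest(plane: Plane) -> Plane:
    """Step 3: keep the top-left sample of each 2x2 block (assumed convention)."""
    return [row[::2] for row in plane[::2]]


def clamp01(x: float) -> float:
    """Step 4: clamp a value to [0, 1]."""
    return min(1.0, max(0.0, x))


def postprocess_yuv420(
    y: Plane, u: Plane, v: Plane, bitdepth: int = 8
) -> Tuple[Plane, Plane, Plane]:
    """Apply steps 2, 3 and 4 in order to dense Y, U and V planes."""

    def apply(plane: Plane, fn) -> Plane:
        return [[fn(s) for s in row] for row in plane]

    # Step 2 on every plane, step 3 on chroma only, step 4 on every plane.
    y, u, v = (apply(p, lambda s: round_to_bitdepth(s, bitdepth)) for p in (y, u, v))
    u, v = downsample_nearest(u), downsample_nearest(v)
    y, u, v = (apply(p, clamp01) for p in (y, u, v))
    return y, u, v
```

With dense planes of size H x W as input, the returned luma plane keeps its size while both chroma planes come out at half the height and half the width.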

Parameters:
  • reference_frames (List[Tensor] | None) – List of tensors representing the reference frames. Can be set to None if no reference frame is available. Defaults to None.

  • quantizer_noise_type (Literal['kumaraswamy', 'gaussian', 'none']) – Defaults to "kumaraswamy".

  • quantizer_type (Literal['softround_alone', 'softround', 'hardround', 'ste', 'none']) – Defaults to "softround".

  • soft_round_temperature (Tensor | None) – Soft round temperature. Used by the softround modes, and by the ste mode to simulate the derivative in the backward pass. Defaults to 0.3.

  • noise_parameter (Tensor | None) – Noise distribution parameter. Defaults to 1.0.

  • AC_MAX_VAL (int) – If different from -1, clamp the value to be in \([-AC\_MAX\_VAL; AC\_MAX\_VAL + 1]\) to write the actual bitstream. Defaults to -1.

  • flag_additional_outputs (bool) – True to fill CoolChicEncoderOutput['additional_data'] with many different quantities which can be used to analyze Cool-chic behavior. Defaults to False.
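
The role of soft_round_temperature can be illustrated with one common tanh-based soft-rounding formulation. This is a scalar sketch under the assumption that the softround modes use a function of this kind; the exact function used by the library is not stated here, and softround below is a hypothetical standalone helper.

```python
import math


def softround(x: float, temperature: float = 0.3) -> float:
    """Differentiable approximation of rounding (assumed formulation).

    A small temperature approaches hard rounding, a large temperature
    approaches the identity function.
    """
    floor_x = math.floor(x)
    delta = x - floor_x - 0.5  # offset from the middle of the unit cell
    scale = math.tanh(0.5 / temperature)  # makes integers fixed points
    return floor_x + 0.5 * math.tanh(delta / temperature) / scale + 0.5


for t in (0.01, 0.3, 100.0):
    print(f"temperature={t}: softround(0.7) = {softround(0.7, t):.4f}")
```

Whatever the temperature, integers and half-integers are left unchanged by this function; only the steepness of the transition between two integers varies.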

Returns:

Output of the FrameEncoder for the forward pass.

Return type:

FrameEncoderOutput

get_param() OrderedDict[Literal['residue', 'motion'], Tensor][source]¶

Return a copy of the weights and biases inside the module.

Returns:

A copy of all weights & biases in the module.

Return type:

OrderedDict[NAME_COOLCHIC_ENC, Tensor]

set_param(
param: OrderedDict[Literal['residue', 'motion'], Tensor],
)[source]¶

Replace the current parameters of the module with param.

Parameters:

param (OrderedDict[NAME_COOLCHIC_ENC, Tensor]) – Parameters to be set.

reinitialize_parameters() None[source]¶

Reinitialize in place the different parameters of a FrameEncoder.

Return type:

None

set_to_train() None[source]¶

Set the current model to training mode, in place. This only affects the quantization.

Return type:

None

set_global_flow(global_flow_1: Tensor, global_flow_2: Tensor) None[source]¶

Set the value of the global flows.

Each global flow is a 2-element tensor: the first element is the horizontal displacement and the second element is the vertical displacement.

Parameters:
  • global_flow_1 (Tensor) – Value of global flow for reference 1. Must have 2 elements.

  • global_flow_2 (Tensor) – Value of global flow for reference 2. Must have 2 elements.

Return type:

None

get_network_rate() Tuple[Dict[Literal['residue', 'motion'], DescriptorCoolChic], int][source]¶

Return the rate (in bits) associated with the parameters (weights and biases) of the different modules.

Returns:

The rate (in bits) associated with the weights and biases of each module of each cool-chic decoder. Also return the overall rate in bits.

Return type:

Tuple[Dict[NAME_COOLCHIC_ENC, DescriptorCoolChic], int]
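
The relation between the per-module rates and the overall rate can be sketched as a sum over a nested dictionary. The layout below (module names mapping to numbers or to further dictionaries) and all the values are assumptions made for illustration; total_rate_bits is a hypothetical helper, not part of the library.

```python
from typing import Any, Mapping


def total_rate_bits(rate_per_module: Mapping[str, Any]) -> float:
    """Sum every leaf of a (possibly nested) dictionary of rates in bits."""
    total = 0
    for value in rate_per_module.values():
        if isinstance(value, Mapping):
            total += total_rate_bits(value)  # descend into sub-modules
        else:
            total += value
    return total


# Made-up rates, in bits, for two Cool-chic encoders.
example_rates = {
    "residue": {"arm": 120, "upsampling": 40, "synthesis": 300},
    "motion": {"arm": 80, "upsampling": 0, "synthesis": {"weight": 50, "bias": 10}},
}
print(total_rate_bits(example_rates))
```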

get_network_quantization_step() Dict[Literal['residue', 'motion'], DescriptorCoolChic][source]¶

Return the quantization step associated with the parameters (weights and biases) of the different modules of each cool-chic decoder. These quantization steps can be None if the model has not yet been quantized.

E.g. {"residue": {"arm": 4, "upsampling": 12, "synthesis": 1}}

Returns:

The quantization step associated with the weights and biases of each module of each cool-chic decoder.

Return type:

Dict[NAME_COOLCHIC_ENC, DescriptorCoolChic]

get_network_expgol_count() Dict[Literal['residue', 'motion'], DescriptorCoolChic][source]¶

Return the Exp-Golomb count parameter associated with the parameters (weights and biases) of the different modules of each cool-chic decoder. These Exp-Golomb parameters can be None if the model has not yet been quantized.

E.g. {"residue": {"arm": 4, "upsampling": 12, "synthesis": 1}}

Returns:

The exp-golomb count parameter associated with the weights and biases of each module of each cool-chic decoder.

Return type:

Dict[NAME_COOLCHIC_ENC, DescriptorCoolChic]
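
To give an idea of how such a parameter influences the rate, the sketch below computes the length of a standard order-k Exp-Golomb code word for a non-negative integer. It assumes that the count parameter plays the role of the order k; how the library maps signed quantized weights to code words is not described here, and exp_golomb_length is a hypothetical helper.

```python
def exp_golomb_length(n: int, k: int = 0) -> int:
    """Length in bits of the order-k Exp-Golomb code word of n >= 0."""
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    quotient = n >> k  # this part is coded with an order-0 code
    prefix_zeros = (quotient + 1).bit_length() - 1  # floor(log2(quotient + 1))
    # Order-0 code word (2 * prefix_zeros + 1 bits) followed by k remainder bits.
    return 2 * prefix_zeros + 1 + k


for k in (0, 1, 2):
    print(k, [exp_golomb_length(n, k) for n in range(8)])
```

A larger order spends more bits on small values and fewer on large ones, which is why a suitable value depends on the spread of the quantized parameters.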

get_total_mac_per_pixel() float[source]¶

Count the number of Multiplication-Accumulation (MAC) per decoded pixel for this model.

Returns:

Number of multiplication-accumulations (MAC) per decoded pixel.

Return type:

float
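
As an illustration of what a MAC count per decoded pixel measures, the sketch below counts the MACs of a small network applied at every pixel. The layer sizes are made up and do not describe an actual Cool-chic architecture; both helpers are hypothetical and ignore biases.

```python
from typing import Sequence


def mac_per_pixel_mlp(layer_dims: Sequence[int]) -> int:
    """MACs per pixel of a pixel-wise MLP with the given layer widths."""
    # A layer going from n_in to n_out features costs n_in * n_out MACs.
    return sum(n_in * n_out for n_in, n_out in zip(layer_dims[:-1], layer_dims[1:]))


def mac_per_pixel_conv(n_in: int, n_out: int, kernel_size: int) -> int:
    """MACs per output pixel of a square 2D convolution."""
    return n_in * n_out * kernel_size * kernel_size


# Made-up example: 7 inputs, two hidden layers of 16 features, 3 outputs,
# followed by a 3x3 convolution over the 3 output channels.
total = mac_per_pixel_mlp([7, 16, 16, 3]) + mac_per_pixel_conv(3, 3, 3)
print(total)
```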

set_to_eval() None[source]¶

Set the current model to test mode, in place. This only affects the quantization.

Return type:

None

to_device(device: Literal['cpu', 'cuda:0']) None[source]¶

Push a model to a given device.

Parameters:

device (Literal['cpu', 'cuda:0']) – The device on which the model should run.

Return type:

None

save(
path_file: str,
frame_encoder_manager: FrameEncoderManager | None = None,
) None[source]¶
Save the FrameEncoder to the file located at path_file.

Optionally save a frame_encoder_manager alongside the current frame encoder to keep track of the training time, record loss etc.

Parameters:
  • path_file (str) – Where to save the FrameEncoder

  • frame_encoder_manager (FrameEncoderManager | None) – Contains (among other things) the rate constraint \(\lambda\) and description of the warm-up preset. It is also used to track the total encoding time and encoding iterations.


Return type:

None

pretty_string(print_detailed_archi: bool = False) str[source]¶

Get a pretty string representing the architectures of the different CoolChicEncoder composing the current FrameEncoder.

Parameters:

print_detailed_archi (bool) – True to print the detailed decoder architecture

Returns:

a pretty string ready to be printed out

Return type:

str

pretty_string_param() str[source]¶

Get a pretty string representing the parameters of the different CoolChicEncoderParameters parameterising the current FrameEncoder.

Return type:

str

class FrameEncoderOutput[source]¶

Dataclass representing the output of FrameEncoder forward.

load_frame_encoder(
path_file: str,
) Tuple[FrameEncoder, FrameEncoderManager | None][source]¶

Load a FrameEncoder from the file located at path_file and return it, together with the FrameEncoderManager saved alongside it, if any.

Parameters:

path_file (str) – Path of the FrameEncoder to be loaded

Returns:

Tuple with a FrameEncoder loaded by the function and an optional FrameEncoderManager

Return type:

Tuple[FrameEncoder, FrameEncoderManager | None]