Frame Encoder
- class FrameEncoder
A FrameEncoder is the object containing everything required to encode a video frame or an image. It is composed of one or more CoolChicEncoder.
- __init__(
- coolchic_enc_param: Dict[Literal['residue', 'motion'], CoolChicEncoderParameter],
- frame_type: Literal['I', 'P', 'B'] = 'I',
- frame_data_type: Literal['rgb', 'yuv420', 'yuv444', 'flow'] = 'rgb',
- bitdepth: Literal[8, 9, 10, 11, 12, 13, 14, 15, 16] = 8,
- index_references: List[int] = [],
- frame_display_index: int = 0,
- Parameters:
coolchic_enc_param (Dict[Literal['residue', 'motion'], ~enc.component.coolchic.CoolChicEncoderParameter]) – Parameters for the underlying CoolChicEncoders
frame_type (Literal['I', 'P', 'B']) – More info in coding_structure.py. Defaults to “I”.
frame_data_type (Literal['rgb', 'yuv420', 'yuv444', 'flow']) – More info in coding_structure.py. Defaults to “rgb”
bitdepth (Literal[8, 9, 10, 11, 12, 13, 14, 15, 16]) – More info in coding_structure.py. Defaults to 8.
index_references (List[int]) – List of the display index of the references. Defaults to []
frame_display_index (int) – display index of the frame being encoded.
- forward(
- reference_frames: List[Tensor] | None = None,
- quantizer_noise_type: Literal['kumaraswamy', 'gaussian', 'none'] = 'kumaraswamy',
- quantizer_type: Literal['softround_alone', 'softround', 'hardround', 'ste', 'none'] = 'softround',
- soft_round_temperature: Tensor | None = tensor(0.3000),
- noise_parameter: Tensor | None = tensor(1.),
- AC_MAX_VAL: int = -1,
- flag_additional_outputs: bool = False,
Perform the entire forward pass of a video frame / image.
- Simulate Cool-chic decoding to obtain both the decoded image \(\hat{\mathbf{x}}\) as a \((B, 3, H, W)\) tensor and its associated rate \(\mathrm{R}(\hat{\mathbf{x}})\) as an \((N)\) tensor, where \(N\) is the number of latent pixels. The rate is given in bits.
- Simulate the saving of the image to a file (Optional). Only if the model has been set in test mode, e.g. with self.set_to_eval(). Take into account that \(\hat{\mathbf{x}}\) is a float Tensor, which is going to be saved as integer values in a file.
\[\hat{\mathbf{x}}_{saved} = \mathtt{round}(\Delta_q \ \hat{\mathbf{x}}) / \Delta_q, \text{ with } \Delta_q = 2^{bitdepth} - 1\]
- Downscale to YUV 420 (Optional). Only if the required output format is YUV420. The current output is a dense Tensor. Downscale the last two channels to obtain a YUV420-like representation. This is done with a nearest neighbor downsampling.
- Clamp the output to be in \([0, 1]\).
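The saving, downscaling and clamping steps described above can be sketched without any framework. This is a minimal illustration under stated assumptions, not the library's implementation: the helper names simulate_saving, downscale_nearest and clamp01 are hypothetical, and planes are plain nested lists rather than tensors.

```python
from typing import List

Plane = List[List[float]]  # one image plane stored as rows of float samples


def simulate_saving(plane: Plane, bitdepth: int = 8) -> Plane:
    """Apply x_saved = round(delta_q * x) / delta_q with delta_q = 2^bitdepth - 1."""
    delta_q = 2 ** bitdepth - 1
    return [[round(delta_q * v) / delta_q for v in row] for row in plane]


def downscale_nearest(plane: Plane) -> Plane:
    """Halve both dimensions by keeping every other sample (nearest neighbor)."""
    return [row[::2] for row in plane[::2]]


def clamp01(plane: Plane) -> Plane:
    """Clamp every sample to the interval [0, 1]."""
    return [[min(max(v, 0.0), 1.0) for v in row] for row in plane]


# Hypothetical sample values, chosen only to exercise each helper.
luma = [[0.5, 1.2], [-0.1, 0.25]]
chroma = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 0.0, 0.1],
    [0.2, 0.3, 0.4, 0.5],
]

saved_luma = clamp01(simulate_saving(luma, bitdepth=8))
small_chroma = downscale_nearest(chroma)
```

With an 8-bit depth, 0.5 is stored as level 128 out of 255, so the value read back differs slightly from the float that was produced. This is the mismatch the test-mode simulation is meant to expose.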
- Parameters:
reference_frames (List[Tensor] | None) – List of tensors representing the reference frames. Can be set to None if no reference frame is available. Defaults to None.
quantizer_noise_type (Literal['kumaraswamy', 'gaussian', 'none']) – Defaults to "kumaraswamy".
quantizer_type (Literal['softround_alone', 'softround', 'hardround', 'ste', 'none']) – Defaults to "softround".
soft_round_temperature (Tensor | None) – Soft round temperature. This is used for the softround modes as well as the ste mode to simulate the derivative in the backward pass. Defaults to 0.3.
noise_parameter (Tensor | None) – Noise distribution parameter. Defaults to 1.0.
AC_MAX_VAL (int) – If different from -1, clamp the value to be in \([-AC\_MAX\_VAL; AC\_MAX\_VAL + 1]\) to write the actual bitstream. Defaults to -1.
flag_additional_outputs (bool) – True to fill CoolChicEncoderOutput['additional_data'] with many different quantities which can be used to analyze Cool-chic behavior. Defaults to False.
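To give an intuition for soft_round_temperature and AC_MAX_VAL, here is a scalar sketch. It assumes the commonly used tanh-based soft rounding formulation; whether the library uses exactly this expression is an assumption, and the function names softround and clamp_symbol are hypothetical.

```python
import math


def softround(x: float, temperature: float) -> float:
    """Differentiable approximation of rounding (assumed tanh-based form).

    A small temperature approaches hard rounding; a large temperature
    approaches the identity function.
    """
    floor_x = math.floor(x)
    delta = x - floor_x - 0.5  # position inside the unit interval, centered on 0
    return floor_x + 0.5 * math.tanh(delta / temperature) / math.tanh(0.5 / temperature) + 0.5


def clamp_symbol(value: int, ac_max_val: int) -> int:
    """Clamp to [-ac_max_val, ac_max_val + 1]; a value of -1 disables clamping."""
    if ac_max_val == -1:
        return value
    return max(-ac_max_val, min(value, ac_max_val + 1))


soft = softround(0.9, temperature=0.3)    # pulled toward 1, but still below it
hard = softround(0.9, temperature=0.01)   # practically equal to round(0.9)
```

Integers are fixed points of this function for any temperature, so the temperature only controls how sharply intermediate values are pulled toward them.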
- Returns:
Output of the FrameEncoder for the forward pass.
- Return type:
- get_param() -> OrderedDict[Literal['residue', 'motion'], Tensor]
Return a copy of the weights and biases inside the module.
- Returns:
A copy of all weights & biases in the module.
- Return type:
OrderedDict[NAME_COOLCHIC_ENC, Tensor]
- set_param(
- param: OrderedDict[Literal['residue', 'motion'], Tensor],
Replace the current parameters of the module with param.
- Parameters:
param (OrderedDict[NAME_COOLCHIC_ENC, Tensor]) – Parameters to be set.
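Because get_param() returns a copy, the pair get_param() / set_param() lends itself to a snapshot-and-rollback pattern. The sketch below only relies on those two documented methods; ToyEncoder is a hypothetical stand-in used to keep the example runnable, and step_or_rollback is not part of the library.

```python
from collections import OrderedDict
from copy import deepcopy


class ToyEncoder:
    """Hypothetical stand-in exposing only get_param() and set_param()."""

    def __init__(self):
        self._param = OrderedDict(residue=[0.0, 0.0])

    def get_param(self):
        return deepcopy(self._param)  # a copy, as the documentation states

    def set_param(self, param):
        self._param = deepcopy(param)


def step_or_rollback(encoder, update_fn, loss_fn) -> bool:
    """Apply update_fn, then restore the previous parameters if the loss got worse."""
    backup = encoder.get_param()
    loss_before = loss_fn(encoder)
    update_fn(encoder)
    if loss_fn(encoder) > loss_before:
        encoder.set_param(backup)
        return False
    return True


def toy_loss(encoder) -> float:
    # Squared distance to an arbitrary target of 1.0 for each parameter.
    return sum((p - 1.0) ** 2 for p in encoder.get_param()["residue"])


enc = ToyEncoder()
kept_good = step_or_rollback(
    enc, lambda e: e.set_param(OrderedDict(residue=[1.0, 1.0])), toy_loss
)
kept_bad = step_or_rollback(
    enc, lambda e: e.set_param(OrderedDict(residue=[5.0, 5.0])), toy_loss
)
```

The second update increases the loss, so the encoder ends up with the parameters from the first update.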
- reinitialize_parameters() -> None
Reinitialize in place the different parameters of a FrameEncoder.
- Return type:
None
- set_to_train() -> None
Set the current model to training mode, in place. This only affects the quantization.
- Return type:
None
- set_global_flow(global_flow_1: Tensor, global_flow_2: Tensor) -> None
Set the value of the global flows.
The global flows are 2-element tensors. The first one is the horizontal displacement and the second one the vertical displacement.
- Parameters:
global_flow_1 (Tensor) – Value of global flow for reference 1. Must have 2 elements.
global_flow_2 (Tensor) – Value of global flow for reference 2. Must have 2 elements.
- Return type:
None
- get_network_rate() -> Tuple[Dict[Literal['residue', 'motion'], DescriptorCoolChic], int]
Return the rate (in bits) associated with the parameters (weights and biases) of the different modules.
- Returns:
The rate (in bits) associated with the weights and biases of each module of each cool-chic decoder. Also return the overall rate in bits.
- Return type:
Tuple[Dict[NAME_COOLCHIC_ENC, DescriptorCoolChic], int]
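The overall rate returned alongside the per-module rates is their sum. As a rough sketch, assuming each descriptor behaves like a nested mapping whose leaves are bit counts (the exact layout of DescriptorCoolChic is not specified here), a recursive sum looks like this. The function total_bits and the sample numbers are hypothetical.

```python
from collections.abc import Mapping


def total_bits(rate) -> int:
    """Sum every leaf of a nested mapping, treating None as zero bits."""
    if rate is None:
        return 0
    if isinstance(rate, Mapping):
        return sum(total_bits(value) for value in rate.values())
    return int(rate)


# Hypothetical per-module rates, in bits.
rate_per_module = {
    "residue": {"arm": 1200, "upsampling": 300, "synthesis": 800},
    "motion": {"arm": 0, "upsampling": None, "synthesis": 50},
}
overall = total_bits(rate_per_module)
```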
- get_network_quantization_step() -> Dict[Literal['residue', 'motion'], DescriptorCoolChic]
Return the quantization step associated with the parameters (weights and biases) of the different modules of each cool-chic decoder. These quantization steps can be None if the model has not yet been quantized. E.g. {"residue": {"arm": 4, "upsampling": 12, "synthesis": 1}}
- Returns:
The quantization step associated with the weights and biases of each module of each cool-chic decoder.
- Return type:
Dict[NAME_COOLCHIC_ENC, DescriptorCoolChic]
- get_network_expgol_count() -> Dict[Literal['residue', 'motion'], DescriptorCoolChic]
Return the Exp-Golomb count parameter associated with the parameters (weights and biases) of the different modules of each cool-chic decoder. These Exp-Golomb parameters can be None if the model has not yet been quantized. E.g. {"residue": {"arm": 4, "upsampling": 12, "synthesis": 1}}
- Returns:
The exp-golomb count parameter associated with the weights and biases of each module of each cool-chic decoder.
- Return type:
Dict[NAME_COOLCHIC_ENC, DescriptorCoolChic]
- get_total_mac_per_pixel() -> float
Count the number of Multiplication-Accumulation (MAC) operations per decoded pixel for this model.
- Returns:
Number of MACs per decoded pixel.
- Return type:
float
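As a worked example of what a MAC count per pixel means, a fully connected layer applied at every pixel costs one MAC per (input, output) pair, and a stride-1 convolution costs that amount for every kernel tap. The layer sizes below are hypothetical and do not describe any actual Cool-chic configuration; biases are ignored and the helper names are illustrative only.

```python
from typing import Sequence


def mlp_mac_per_pixel(layer_sizes: Sequence[int]) -> int:
    """MACs per pixel of an MLP applied independently at every pixel."""
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def conv_mac_per_pixel(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """MACs per output pixel of a stride-1 convolution with a square kernel."""
    return in_channels * out_channels * kernel_size ** 2


# Hypothetical architecture: an 8 -> 16 -> 16 -> 2 MLP followed by a 3x3 conv.
mlp_macs = mlp_mac_per_pixel([8, 16, 16, 2])   # 8*16 + 16*16 + 16*2
conv_macs = conv_mac_per_pixel(7, 3, 3)        # 7*3*9
total_macs = mlp_macs + conv_macs
```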
- set_to_eval() -> None
Set the current model to test mode, in place. This only affects the quantization.
- Return type:
None
- to_device(device: Literal['cpu', 'cuda:0']) -> None
Push a model to a given device.
- Parameters:
device (Literal['cpu', 'cuda:0']) – The device on which the model should run.
- Return type:
None
- save(
- path_file: str,
- frame_encoder_manager: FrameEncoderManager | None = None,
- Save the FrameEncoder to the file path_file.
Optionally save a frame_encoder_manager alongside the current frame encoder to keep track of the training time, recorded loss, etc.
- Parameters:
path_file (str) – Where to save the FrameEncoder
frame_encoder_manager (FrameEncoderManager | None) – Contains (among other things) the rate constraint \(\lambda\) and description of the warm-up preset. It is also used to track the total encoding time and encoding iterations.
- Return type:
None
- pretty_string(print_detailed_archi: bool = False) -> str
Get a pretty string representing the architectures of the different CoolChicEncoder composing the current FrameEncoder.
- Parameters:
print_detailed_archi (bool) – True to print the detailed decoder architecture
- Returns:
a pretty string ready to be printed out
- Return type:
str
- load_frame_encoder(
- path_file: str,
Load and return the FrameEncoder stored at path_file, together with the FrameEncoderManager saved alongside it, if any.
- Parameters:
path_file (str) – Path of the FrameEncoder to be loaded
- Returns:
Tuple with a FrameEncoder loaded by the function and an optional FrameEncoderManager
- Return type:
Tuple[FrameEncoder, FrameEncoderManager | None]