Train
train(
    frame_encoder: FrameEncoder,
    frame: Frame,
    frame_encoder_manager: FrameEncoderManager,
    start_lr: float = 0.01,
    cosine_scheduling_lr: bool = True,
    max_iterations: int = 10000,
    frequency_validation: int = 100,
    patience: int = 10,
    optimized_module: List[Literal['all', 'arm', 'upsampling', 'synthesis', 'latent']] = ['all'],
    quantizer_type: Literal['softround_alone', 'softround', 'hardround', 'ste', 'none'] = 'softround',
    quantizer_noise_type: Literal['kumaraswamy', 'gaussian', 'none'] = 'kumaraswamy',
    softround_temperature: Tuple[float, float] = (0.3, 0.2),
    noise_parameter: Tuple[float, float] = (2.0, 1.0),
) -> FrameEncoder
Train a FrameEncoder and return the updated module. This function is meant to be called whenever the parameters of a FrameEncoder need to be optimized, either during the warm-up (a competition between multiple candidate initializations) or during one of the stages of the actual training phase.

The module is optimized according to the following loss function:

\[\begin{split}\mathcal{L} = ||\mathbf{x} - \hat{\mathbf{x}}||^2 + \lambda \mathrm{R}(\hat{\mathbf{x}}), \text{ with } \begin{cases} \mathbf{x} & \text{the original image}\\ \hat{\mathbf{x}} & \text{the coded image}\\ \mathrm{R}(\hat{\mathbf{x}}) & \text{a measure of the rate of } \hat{\mathbf{x}} \end{cases}\end{split}\]

Warning
The parameter frame_encoder_manager, which tracks the encoding time of the frame (total_training_time_sec) and the number of encoding iterations (iterations_counter), is modified in place by this function.

- Parameters:
frame_encoder (FrameEncoder) – Module to be trained.

frame (Frame) – The original image to be compressed and its references.

frame_encoder_manager (FrameEncoderManager) – Contains (among other things) the rate constraint \(\lambda\). It is also used to track the total encoding time and the number of encoding iterations. Modified in place.

start_lr (float) – Initial learning rate. Either kept constant for the entire training, or decayed with a cosine schedule; see cosine_scheduling_lr below for more details. Defaults to 1e-2.

cosine_scheduling_lr (bool) – True to schedule the learning rate from start_lr at iteration 0 down to 0 at iteration max_iterations. Defaults to True.

max_iterations (int) – Do at most max_iterations iterations. The actual number of iterations can be smaller through the patience mechanism. Defaults to 10000.

frequency_validation (int) – Check (and print) the performance every frequency_validation iterations. This drives the patience mechanism. Defaults to 100.

patience (int) – Exit the training after patience iterations without any improvement of the results. Patience is disabled by setting patience = max_iterations. If patience is used alongside cosine_scheduling_lr, it does not end the training; instead, the best model so far is simply reloaded once the patience is reached, and the training continues. Defaults to 10.

optimized_module (List[Literal['all', 'arm', 'upsampling', 'synthesis', 'latent']]) – List of the modules to be optimized. Most often you'd want to use optimized_module = ['all']. Defaults to ['all'].

quantizer_type (Literal['softround_alone', 'softround', 'hardround', 'ste', 'none']) – Quantizer to use during training. See encoder/component/core/quantizer.py for more information. Defaults to "softround".

quantizer_noise_type (Literal['kumaraswamy', 'gaussian', 'none']) – Random noise used by the quantizer. More information available in encoder/component/core/quantizer.py. Defaults to "kumaraswamy".

softround_temperature (Tuple[float, float]) – The softround temperature is linearly scheduled during the training: at iteration 0 it is equal to softround_temperature[0], while at iteration max_iterations it is equal to softround_temperature[1]. Note that the patience might interrupt the training before this last value is reached. Defaults to (0.3, 0.2).

noise_parameter (Tuple[float, float]) – The random noise temperature is linearly scheduled during the training: at iteration 0 it is equal to noise_parameter[0], while at iteration max_iterations it is equal to noise_parameter[1]. Note that the patience might interrupt the training before this last value is reached. Defaults to (2.0, 1.0).
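The two kinds of schedules described above can be sketched as follows. This is a minimal illustration only; the helper names are hypothetical and not the actual functions used by the codebase:

```python
import math

def linear_schedule(initial: float, final: float, itr: int, max_itr: int) -> float:
    # Linear interpolation from `initial` (iteration 0) to `final` (iteration
    # max_itr), as described for softround_temperature and noise_parameter.
    return initial + (final - initial) * itr / max_itr

def cosine_schedule_lr(start_lr: float, itr: int, max_itr: int) -> float:
    # Cosine decay from start_lr (iteration 0) to 0 (iteration max_itr),
    # matching the behaviour described for cosine_scheduling_lr.
    return start_lr * 0.5 * (1.0 + math.cos(math.pi * itr / max_itr))
```

For example, with the default softround_temperature = (0.3, 0.2), the temperature at the halfway point of a 10000-iteration training would be linear_schedule(0.3, 0.2, 5000, 10000), i.e. 0.25.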
- Returns:
The trained frame encoder.
- Return type:
FrameEncoder
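The rate-distortion loss \(\mathcal{L}\) used by this function can be sketched as follows. This is a minimal illustration on plain Python lists; the actual implementation operates on tensors, and the rate constraint \(\lambda\) comes from frame_encoder_manager:

```python
def rd_loss(x, x_hat, rate, lmbda):
    # Distortion term ||x - x_hat||^2: squared error between the
    # original image x and the coded image x_hat.
    distortion = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    # Rate-distortion trade-off: lambda weights the rate term R(x_hat).
    return distortion + lmbda * rate
```

A larger \(\lambda\) penalizes the rate more heavily, pushing the optimization toward a smaller bitstream at the cost of a higher distortion.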