Train
train(
    frame_encoder: FrameEncoder,
    frame: Frame,
    frame_encoder_manager: FrameEncoderManager,
    start_lr: float = 0.01,
    cosine_scheduling_lr: bool = True,
    max_iterations: int = 10000,
    frequency_validation: int = 100,
    patience: int = 10,
    optimized_module: List[Literal['all', 'arm', 'upsampling', 'synthesis', 'latent']] = ['all'],
    quantizer_type: Literal['softround_alone', 'softround', 'hardround', 'ste', 'none'] = 'softround',
    quantizer_noise_type: Literal['kumaraswamy', 'gaussian', 'none'] = 'kumaraswamy',
    softround_temperature: Tuple[float, float] = (0.3, 0.2),
    noise_parameter: Tuple[float, float] = (2.0, 1.0),
) -> FrameEncoder
Train a FrameEncoder and return the updated module. This function is meant to be called whenever the parameters of a FrameEncoder need to be optimized, either during the warm-up (a competition between multiple candidate initializations) or during one of the stages of the actual training phase.

The module is optimized according to the following loss function:

\[\begin{split}\mathcal{L} = ||\mathbf{x} - \hat{\mathbf{x}}||^2 + \lambda \mathrm{R}(\hat{\mathbf{x}}), \text{ with } \begin{cases} \mathbf{x} & \text{the original image}\\ \hat{\mathbf{x}} & \text{the coded image}\\ \mathrm{R}(\hat{\mathbf{x}}) & \text{a measure of the rate of } \hat{\mathbf{x}} \end{cases}\end{split}\]

Warning
The parameter frame_encoder_manager, which tracks the encoding time of the frame (total_training_time_sec) and the number of encoding iterations (iterations_counter), is modified in place by this function.

- Parameters:
frame_encoder (FrameEncoder) – Module to be trained.

frame (Frame) – The original image to be compressed and its references.

frame_encoder_manager (FrameEncoderManager) – Contains (among other things) the rate constraint \(\lambda\). It is also used to track the total encoding time and the number of encoding iterations. Modified in place.

start_lr (float) – Initial learning rate. Either kept constant for the entire training, or decayed with a cosine schedule; see cosine_scheduling_lr below for more details. Defaults to 1e-2.

cosine_scheduling_lr (bool) – True to schedule the learning rate from start_lr at iteration 0 down to 0 at iteration max_iterations. Defaults to True.

max_iterations (int) – Do at most max_iterations iterations. The actual number of iterations can be smaller through the patience mechanism. Defaults to 10000.

frequency_validation (int) – Check (and print) the performance every frequency_validation iterations. This drives the patience mechanism. Defaults to 100.

patience (int) – Exit the training after patience iterations without any improvement of the results. Patience is disabled by setting patience = max_iterations. If patience is used alongside cosine_scheduling_lr, it does not end the training; instead, the best model so far is simply reloaded once the patience is reached, and the training continues. Defaults to 10.

optimized_module (List[Literal['all', 'arm', 'upsampling', 'synthesis', 'latent']]) – List of the modules to be optimized. Most often you'd want to use optimized_module = ['all']. Defaults to ['all'].

quantizer_type (Literal['softround_alone', 'softround', 'hardround', 'ste', 'none']) – Quantizer to use during training. See encoder/component/core/quantizer.py for more information. Defaults to "softround".

quantizer_noise_type (Literal['kumaraswamy', 'gaussian', 'none']) – Random noise used by the quantizer. More information available in encoder/component/core/quantizer.py. Defaults to "kumaraswamy".

softround_temperature (Tuple[float, float]) – The softround temperature is linearly scheduled during the training: at iteration 0 it is equal to softround_temperature[0], while at iteration max_iterations it is equal to softround_temperature[1]. Note that the patience might interrupt the training before this last value is reached. Defaults to (0.3, 0.2).

noise_parameter (Tuple[float, float]) – The random noise temperature is linearly scheduled during the training: at iteration 0 it is equal to noise_parameter[0], while at iteration max_iterations it is equal to noise_parameter[1]. Note that the patience might interrupt the training before this last value is reached. Defaults to (2.0, 1.0).
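The two kinds of schedules described above can be sketched as follows. This is a minimal illustration only; the helper names are hypothetical and not the actual functions used by the codebase:

```python
import math

def linear_schedule(initial: float, final: float, itr: int, max_itr: int) -> float:
    # Linear interpolation from `initial` (iteration 0) to `final` (iteration
    # max_itr), as described for softround_temperature and noise_parameter.
    return initial + (final - initial) * itr / max_itr

def cosine_schedule_lr(start_lr: float, itr: int, max_itr: int) -> float:
    # Cosine decay from start_lr (iteration 0) to 0 (iteration max_itr),
    # matching the behaviour described for cosine_scheduling_lr.
    return start_lr * 0.5 * (1.0 + math.cos(math.pi * itr / max_itr))
```

For example, with the default softround_temperature = (0.3, 0.2), the temperature at the halfway point of a 10000-iteration training would be linear_schedule(0.3, 0.2, 5000, 10000), i.e. 0.25.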
- Returns:
The trained frame encoder.
- Return type:
FrameEncoder
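The rate-distortion loss \(\mathcal{L}\) used by this function can be sketched as follows. This is a minimal illustration on plain Python lists; the actual implementation operates on tensors, and the rate constraint \(\lambda\) comes from frame_encoder_manager:

```python
def rd_loss(x, x_hat, rate, lmbda):
    # Distortion term ||x - x_hat||^2: squared error between the
    # original image x and the coded image x_hat.
    distortion = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    # Rate-distortion trade-off: lambda weights the rate term R(x_hat).
    return distortion + lmbda * rate
```

A larger \(\lambda\) penalizes the rate more heavily, pushing the optimization toward a smaller bitstream at the cost of a higher distortion.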