Upsampling¶
- class Upsampling[source]¶
Create the upsampling module. Its role is to upsample the hierarchical latent variables \(\hat{\mathbf{y}} = \{\hat{\mathbf{y}}_i \in \mathbb{Z}^{C_i \times H_i \times W_i}, i = 0, \ldots, L - 1\}\), where \(L\) is the number of latent resolutions and \(H_i = \frac{H}{2^i}\), \(W_i = \frac{W}{2^i}\) with \(W, H\) the width and height of the image.
The Upsampling transforms this hierarchical latent variable \(\hat{\mathbf{y}}\) into the dense representation \(\hat{\mathbf{z}}\) as follows:
\[\hat{\mathbf{z}} = f_{\upsilon}(\hat{\mathbf{y}}), \text{ with } \hat{\mathbf{z}} \in \mathbb{R}^{C \times H \times W} \text{ and } C = \sum_i C_i.\]
The upsampling relies on a single custom transposed convolution UpsamplingConvTranspose2d performing a 2x upsampling of a 1-channel input. This transposed convolution is called repeatedly to upsample each channel of each resolution until it reaches the required \(H \times W\) dimensions.
The kernel of the UpsamplingConvTranspose2d is either learned or static, depending on the value of the flag static_upsampling_kernel. In either case, the kernel is initialized with a well-known bilinear or bicubic kernel, depending on the requested upsampling_kernel_size:
- If upsampling_kernel_size >= 4 and upsampling_kernel_size < 8, a bilinear kernel (with zero padding if necessary) is used as initialization.
- If upsampling_kernel_size >= 8, a bicubic kernel (with zero padding if necessary) is used as initialization.
Warning
The upsampling_kernel_size must be at least 4 and a multiple of 2.
- __init__(upsampling_kernel_size: int, static_upsampling_kernel: bool)[source]¶
- Parameters:
upsampling_kernel_size (int) – Upsampling kernel size. Must be greater than or equal to 4 and a multiple of two.
static_upsampling_kernel (bool) – If true, don’t learn the upsampling kernel.
- forward(decoder_side_latent: List[Tensor]) Tensor [source]¶
Upsample a list of \(L\) tensors, where the i-th tensor has a shape \((B, C_i, \frac{H}{2^i}, \frac{W}{2^i})\) to obtain a dense representation \((B, \sum_i C_i, H, W)\). This dense representation is ready to be used as the synthesis input.
- Parameters:
decoder_side_latent (List[Tensor]) – list of \(L\) tensors with various shapes \((B, C_i, \frac{H}{2^i}, \frac{W}{2^i})\)
- Returns:
Dense representation \((B, \sum_i C_i, H, W)\).
- Return type:
Tensor
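The shape bookkeeping of forward can be illustrated with a small pure-Python sketch. All names here are hypothetical stand-ins, the batch dimension is dropped for brevity, and nearest-neighbour doubling replaces the learned UpsamplingConvTranspose2d so the example stays dependency-free; only the shapes, not the values, match the real module.

```python
def upsample2x(channel):
    """Nearest-neighbour 2x upsampling of one 2-D channel
    (stand-in for the learned UpsamplingConvTranspose2d)."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                     # duplicate each row
    return out

def forward(decoder_side_latent):
    """Upsample each channel of the i-th latent i times, then
    concatenate all channels into one dense representation."""
    dense = []
    for i, latent in enumerate(decoder_side_latent):  # latent: list of channels
        for channel in latent:
            for _ in range(i):
                channel = upsample2x(channel)
            dense.append(channel)
    return dense  # shape (sum_i C_i, H, W)

# Two resolutions: y_0 with 1 channel at 4x4, y_1 with 2 channels at 2x2.
y0 = [[[float(r * 4 + c) for c in range(4)] for r in range(4)]]
y1 = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
z = forward([y0, y1])
print(len(z), len(z[0]), len(z[0][0]))  # 3 4 4  -> (C_0 + C_1, H, W)
```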
- get_param() OrderedDict[str, Tensor] [source]¶
Return a copy of the weights and biases inside the module.
- Returns:
A copy of all weights & biases in the layers.
- Return type:
OrderedDict[str, Tensor]
- class UpsamplingConvTranspose2d[source]¶
Wrapper around the usual nn.ConvTranspose2d layer. It performs a 2x upsampling of a latent variable with a single input and output channel. The kernel can be learned or static, depending on the flag static_upsampling_kernel. Its initialization depends on the requested kernel size: if the kernel size is 4 or 6, we use the bilinear kernel, with zero padding if necessary; if the kernel size is 8 or bigger, we rely on the bicubic kernel.
- __init__(upsampling_kernel_size: int, static_upsampling_kernel: bool)[source]¶
- Parameters:
upsampling_kernel_size (int) – Upsampling kernel size. Should be >= 4 and a multiple of two.
static_upsampling_kernel (bool) – If true, don’t learn the upsampling kernel.
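A stride-2 transposed convolution yields an output of exactly twice the input length when the padding is (kernel_size - 2) / 2. The 1D sketch below illustrates this size arithmetic; it is an assumption-laden stand-in (hypothetical function name, zero boundary handling), not the library's implementation, which operates in 2D via nn.ConvTranspose2d and may pad borders differently.

```python
def conv_transpose_2x_1d(signal, kernel):
    """Stride-2 transposed convolution cropped so the output has
    exactly twice the input length (padding = (len(kernel) - 2) // 2)."""
    k = len(kernel)
    pad = (k - 2) // 2
    # Full transposed-conv output: scatter the kernel at stride-2 positions.
    full = [0.0] * (2 * (len(signal) - 1) + k)
    for i, v in enumerate(signal):
        for j, w in enumerate(kernel):
            full[2 * i + j] += v * w
    return full[pad: pad + 2 * len(signal)]

# A constant input stays constant away from the borders.
out = conv_transpose_2x_1d([1.0, 1.0, 1.0, 1.0], [0.25, 0.75, 0.75, 0.25])
print(len(out))  # 8  -> twice the input length
```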
- initialize_parameters() None [source]¶
Initialize **in-place** the weights and the biases of the transposed convolution layer performing the upsampling.
Biases are always set to zero.
Weights are set to a (padded) bilinear kernel if the kernel size is 4 or 6, and to a (padded) bicubic kernel if the kernel size is 8 or more.
- Return type:
None