Upsampling¶
- class Upsampling[source]¶
Create the upsampling module. Its role is to upsample the hierarchical latent variables \(\hat{\mathbf{y}} = \{\hat{\mathbf{y}}_i \in \mathbb{Z}^{C_i \times H_i \times W_i}, i = 0, \ldots, L - 1\}\), where \(L\) is the number of latent resolutions and \(H_i = \frac{H}{2^i}\), \(W_i = \frac{W}{2^i}\), with \(W, H\) the width and height of the image.
The Upsampling transforms this hierarchical latent variable \(\hat{\mathbf{y}}\) into the dense representation \(\hat{\mathbf{z}}\) as follows:
\[\hat{\mathbf{z}} = f_{\upsilon}(\hat{\mathbf{y}}), \text{ with } \hat{\mathbf{z}} \in \mathbb{R}^{C \times H \times W} \text{ and } C = \sum_i C_i.\]
For a toy example with 3 latent grids (--n_ft_per_res=1,1,1), the overall diagram of the upsampling is as follows:

      +---------+
y0 -> | TConv2d | -----+
      +---------+      |
                       v
      +--------+    +-----+    +---------+
y1 -> | Conv2d | -> | cat | -> | TConv2d | -------+
      +--------+    +-----+    +---------+        |
                                                  v
                                 +--------+    +-----+    +---------+
y2 ----------------------------> | Conv2d | -> | cat | -> | TConv2d | -> dense
                                 +--------+    +-----+    +---------+
Here y0 has the smallest resolution, y1 has twice the resolution of y0, and so on. There are two different sets of filters:
- The TConv filters actually perform the x2 upsampling; they are referred to as upsampling filters. Implemented using UpsamplingSeparableSymmetricConvTranspose2d.
- The Conv filters pre-process the signal prior to concatenation; they are referred to as pre-concatenation filters. Implemented using UpsamplingSeparableSymmetricConv2d.
Kernel sizes for the upsampling and pre-concatenation filters are modified through the --ups_k_size and --ups_preconcat_k_size arguments. Each upsampling filter and each pre-concatenation filter is different. They are all separable and symmetric.
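The diagram above can be sketched in plain PyTorch. This is a hedged toy version, not the library's implementation: the function name and kernel handling are illustrative, the latents are assumed ordered coarsest first (as in the diagram), the concatenation order is arbitrary, and the final x2 TConv of the diagram is omitted so that the dense output keeps the finest grid's resolution, matching the forward() contract documented below.

```python
import torch
import torch.nn.functional as F


def toy_hierarchical_upsampling(latents, tconv_kernels, preconcat_kernels):
    """Toy sketch of the diagram: latents[0] is the coarsest grid.

    tconv_kernels[i]: (1, 1, k, k) kernel for the i-th x2 transposed conv
    (k even), applied channel by channel.
    preconcat_kernels[i]: (1, 1, kp, kp) pre-concatenation kernel (kp odd).
    All names and the single shared kernel per stage are illustrative.
    """
    x = latents[0]
    for i, y in enumerate(latents[1:]):
        # x2 upsampling of the accumulated coarse representation.
        k = tconv_kernels[i]
        pad = (k.shape[-1] - 2) // 2  # keeps output exactly 2x the input
        up = torch.cat(
            [F.conv_transpose2d(x[:, c : c + 1], k, stride=2, padding=pad)
             for c in range(x.shape[1])],
            dim=1,
        )
        # Pre-concatenation filtering of the finer latent (odd kernel).
        kp = preconcat_kernels[i]
        y_f = torch.cat(
            [F.conv2d(y[:, c : c + 1], kp, padding=kp.shape[-1] // 2)
             for c in range(y.shape[1])],
            dim=1,
        )
        x = torch.cat([y_f, up], dim=1)
    return x
```

With three single-channel grids of sizes 4x4, 8x8 and 16x16, this returns a dense tensor with 3 channels at 16x16.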
Upsampling convolutions are initialized with a bilinear or bicubic kernel, depending on the requested ups_k_size:
- If 4 <= ups_k_size < 8, a bilinear kernel (with zero padding if necessary) is used as initialization.
- If ups_k_size >= 8, a bicubic kernel (with zero padding if necessary) is used as initialization.
Pre-concatenation convolutions are initialized with a Dirac kernel.
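The zero-padding rule for the bilinear branch can be sketched in a few lines of plain Python. The function name is hypothetical, only the bilinear case (4 <= ups_k_size < 8) is shown, and the tap values assumed here are the usual 4-tap bilinear kernel for x2 upsampling.

```python
def init_upsampling_kernel_1d(ups_k_size):
    """Hypothetical sketch: pad the 4-tap bilinear kernel with zeros up to
    ups_k_size when 4 <= ups_k_size < 8. A bicubic kernel would be used
    instead for ups_k_size >= 8 (not shown)."""
    assert ups_k_size >= 4 and ups_k_size % 2 == 0
    bilinear = [0.25, 0.75, 0.75, 0.25]  # sums to 2: x2 transposed conv
    pad = (ups_k_size - len(bilinear)) // 2
    return [0.0] * pad + bilinear + [0.0] * pad
```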
Warning
The ups_k_size must be at least 4 and a multiple of 2. The ups_preconcat_k_size must be odd.
- __init__(ups_k_size: int, ups_preconcat_k_size: int, n_ups_kernel: int, n_ups_preconcat_kernel: int)
- Parameters:
ups_k_size (int) – Upsampling (TransposedConv) kernel size. Should be even and >= 4.
ups_preconcat_k_size (int) – Pre-concatenation kernel size. Should be odd.
n_ups_kernel (int) – Number of different upsampling kernels. Usually set to the number of latents minus 1 (because the full-resolution latent is not upsampled). It can also be set to 1 to share the same kernel across all latent variables.
n_ups_preconcat_kernel (int) – Number of different pre-concatenation filters. Usually set to the number of latents minus 1 (because the smallest resolution is not filtered prior to concatenation). It can also be set to 1 to share the same kernel across all latent variables.
- forward(decoder_side_latent: List[Tensor]) Tensor [source]¶
Upsample a list of \(L\) tensors, where the i-th tensor has a shape \((B, C_i, \frac{H}{2^i}, \frac{W}{2^i})\) to obtain a dense representation \((B, \sum_i C_i, H, W)\). This dense representation is ready to be used as the synthesis input.
- Parameters:
decoder_side_latent (List[Tensor]) – list of \(L\) tensors with various shapes \((B, C_i, \frac{H}{2^i}, \frac{W}{2^i})\)
- Returns:
Dense representation \((B, \sum_i C_i, H, W)\).
- Return type:
Tensor
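The shape contract of forward() can be illustrated with plain PyTorch, using bilinear interpolation as a stand-in for the learned filters. This is a sketch of the input/output shapes only, not the module's actual computation; the helper name is hypothetical.

```python
import torch
import torch.nn.functional as F

# L = 3 latent grids with C_i = 1 channel each (the toy
# --n_ft_per_res=1,1,1 setup); the i-th grid has shape (B, 1, H/2^i, W/2^i).
B, H, W = 1, 32, 32
decoder_side_latent = [torch.randn(B, 1, H >> i, W >> i) for i in range(3)]


def dense_from_latents(latents):
    """Stand-in for Upsampling.forward: bring every grid back to (H, W)
    with bilinear interpolation, then channel-concatenate."""
    full_h, full_w = latents[0].shape[-2:]
    return torch.cat(
        [F.interpolate(y, size=(full_h, full_w), mode="bilinear",
                       align_corners=False)
         for y in latents],
        dim=1,
    )


dense = dense_from_latents(decoder_side_latent)
print(dense.shape)  # torch.Size([1, 3, 32, 32]), i.e. (B, sum_i C_i, H, W)
```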
- get_param() OrderedDict[str, Tensor] [source]¶
Return a copy of the weights and biases inside the module.
- Returns:
A copy of all weights & biases in the layers.
- Return type:
OrderedDict[str, Tensor]
- class UpsamplingSeparableSymmetricConvTranspose2d[source]¶
A TransposedConv2D which has a separable and symmetric even kernel.
Separable means that the 2D-kernel \(\mathbf{w}_{2D}\) can be expressed as the outer product of a 1D kernel \(\mathbf{w}_{1D}\):
\[\mathbf{w}_{2D} = \mathbf{w}_{1D} \otimes \mathbf{w}_{1D}.\]The 1D kernel \(\mathbf{w}_{1D}\) is also symmetric. That is, the 1D kernel is something like \(\mathbf{w}_{1D} = \left(a\ b\ c\ c\ b\ a \right).\)
The symmetric constraint is obtained through the module _Parameterization_Symmetric_1d. The separable constraint is obtained by applying the 1D kernel twice.
- __init__(kernel_size: int)[source]¶
- Parameters:
kernel_size (int) – Upsampling kernel size. Shall be even and >= 4.
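The symmetric constraint on the even 1D kernel can be sketched as follows: only half the taps are stored as free parameters and mirrored into the full kernel, in the spirit of what _Parameterization_Symmetric_1d provides. The helper name is hypothetical, not the library's implementation.

```python
def symmetric_even_kernel_1d(half_taps):
    """Hypothetical sketch: mirror the stored half of an even symmetric
    kernel, e.g. (a, b, c) -> (a, b, c, c, b, a)."""
    return list(half_taps) + list(reversed(half_taps))
```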
- initialize_parameters() None [source]¶
Initialize the parameters of a UpsamplingSeparableSymmetricConvTranspose2d layer.
Biases are always set to zero. Weights are initialized as a (possibly zero-padded) bilinear filter when target_k_size is 4 or 6; otherwise a bicubic filter is used.
- Return type:
None
- forward(x: Tensor) Tensor [source]¶
Perform the spatial upsampling (with scale 2) of a single-channel input. Note that the upsampling filter is both symmetric and separable. The actual implementation of the forward depends on self.training.
If we’re training, we use a non-separable implementation: we first compute the 2D kernel through an outer product and then apply a single 2D convolution. This is more stable.
If we’re not training, we use two successive 1D convolutions.
- Parameters:
x (Tensor) – Single channel input with shape \((B, 1, H, W)\)
- Returns:
Upsampled version of the input with shape \((B, 1, 2H, 2W)\)
- Return type:
Tensor
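The two execution paths described above can be sketched with a hypothetical stand-in: a scale-2 transposed convolution whose 2D kernel is the outer product of a single 1D kernel w1d. The padding convention assumed here ((k - 2) / 2 for an even k) makes the output exactly twice the input size; both paths produce the same result.

```python
import torch
import torch.nn.functional as F


def tconv_x2(x, w1d, training):
    """Sketch of the separable x2 transposed convolution.

    x: (B, 1, H, W) single-channel input; w1d: (k,) even symmetric kernel.
    training=True: single 2D transposed conv with the outer-product kernel.
    training=False: two successive 1D transposed convolutions.
    """
    k = w1d.numel()
    pad = (k - 2) // 2  # output size is exactly (2H, 2W)
    if training:
        w2d = torch.outer(w1d, w1d).view(1, 1, k, k)
        return F.conv_transpose2d(x, w2d, stride=2, padding=pad)
    # Vertical then horizontal 1D transposed convolutions.
    wv = w1d.view(1, 1, k, 1)
    wh = w1d.view(1, 1, 1, k)
    y = F.conv_transpose2d(x, wv, stride=(2, 1), padding=(pad, 0))
    return F.conv_transpose2d(y, wh, stride=(1, 2), padding=(0, pad))
```

Since the 2D kernel is separable, the single 2D pass and the two 1D passes agree numerically (up to floating-point error), which is why the module can switch between them.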
- class UpsamplingSeparableSymmetricConv2d[source]¶
A conv2D which has a separable and symmetric odd kernel.
Separable means that the 2D-kernel \(\mathbf{w}_{2D}\) can be expressed as the outer product of a 1D kernel \(\mathbf{w}_{1D}\):
\[\mathbf{w}_{2D} = \mathbf{w}_{1D} \otimes \mathbf{w}_{1D}.\]The 1D kernel \(\mathbf{w}_{1D}\) is also symmetric. That is, the 1D kernel is something like \(\mathbf{w}_{1D} = \left(a\ b\ c\ b\ a \right).\)
The symmetric constraint is obtained through the module _Parameterization_Symmetric_1d. The separable constraint is obtained by applying the 1D kernel twice.
- __init__(kernel_size: int)[source]¶
- kernel_size: Size of the kernel \(\mathbf{w}_{1D}\), e.g. 7 to obtain a symmetric, separable 7x7 filter. Must be odd!
- Parameters:
kernel_size (int)
- initialize_parameters() None [source]¶
Initialize the weights and the biases of the convolution layer performing the pre-concatenation filtering.
Biases are always set to zero.
Weights are set to \((0,\ 0,\ 0,\ \ldots,\ 1)\) so that, once the symmetric reparameterization is applied, a Dirac kernel is obtained, e.g. \((0,\ 0,\ 0,\ \ldots,\ 1,\ \ldots,\ 0,\ 0,\ 0)\).
- Return type:
None
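The Dirac initialization can be sketched in plain Python: store (kernel_size + 1) / 2 taps ending in 1, then mirror them (sharing the centre tap) into the full odd kernel. The helper name is hypothetical.

```python
def dirac_symmetric_kernel(kernel_size):
    """Hypothetical sketch: initialize the stored half of an odd symmetric
    kernel to (0, ..., 0, 1); mirroring it (centre tap shared) yields the
    Dirac kernel (0, ..., 0, 1, 0, ..., 0)."""
    n_stored = (kernel_size + 1) // 2
    half = [0.0] * (n_stored - 1) + [1.0]
    return half + half[-2::-1]  # mirror, excluding the centre tap
```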
- forward(x: Tensor) Tensor [source]¶
Perform a “normal” 2D convolution, except that the underlying kernel is both separable and symmetric. The actual implementation of the forward depends on self.training.
If we’re training, we use a non-separable implementation: we first compute the 2D kernel through an outer product and then apply a single 2D convolution. This is more stable.
If we’re not training, we use two successive 1D convolutions.
Warning
There is a residual connection in the forward.
- Parameters:
x (Tensor) – [B, 1, H, W] tensor to be filtered. Must have only one channel.
- Returns:
Filtered tensor [B, 1, H, W].
- Return type:
Tensor
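The eval path of this forward (two 1D convolutions plus the residual connection from the warning above) can be sketched as follows. This is a stand-in, not the library's code, and it assumes the residual is a plain addition of the input, which the source does not spell out.

```python
import torch
import torch.nn.functional as F


def preconcat_filter(x, w1d):
    """Sketch of the pre-concatenation filtering: a symmetric, separable
    odd kernel w1d of shape (k,) applied as two 1D convolutions to a
    (B, 1, H, W) input, plus a residual connection.
    Assumption: the residual is a plain addition of the input."""
    k = w1d.numel()
    pad = k // 2  # odd kernel: output keeps the (H, W) spatial size
    y = F.conv2d(x, w1d.view(1, 1, k, 1), padding=(pad, 0))  # vertical
    y = F.conv2d(y, w1d.view(1, 1, 1, k), padding=(0, pad))  # horizontal
    return y + x  # assumed residual connection
```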