YUV Format
- class DictTensorYUV
TypedDict representing a YUV420 frame.
Hint: torch.jit requires the I/O of modules to be either Tensor, List or Dict. So we don't use a Python dataclass here and rely on TypedDict instead.
- Parameters:
y (Tensor) – Tensor with shape \(([B, 1, H, W])\).
u (Tensor) – Tensor with shape \(([B, 1, \frac{H}{2}, \frac{W}{2}])\).
v (Tensor) – Tensor with shape \(([B, 1, \frac{H}{2}, \frac{W}{2}])\).
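As a rough illustration of the hint above, a TypedDict with three tensor fields might be declared as follows. The class name `FrameYUV420` and the helper `make_blank_frame` are hypothetical stand-ins, not the library's definitions; only the field names and shapes come from the parameter list, and even H and W are assumed.

```python
from typing import TypedDict

import torch
from torch import Tensor


class FrameYUV420(TypedDict):
    """Hypothetical stand-in mirroring the documented fields of DictTensorYUV."""

    y: Tensor  # [B, 1, H, W]
    u: Tensor  # [B, 1, H / 2, W / 2]
    v: Tensor  # [B, 1, H / 2, W / 2]


def make_blank_frame(batch: int, height: int, width: int) -> FrameYUV420:
    """Build an all-zero 4:2:0 frame (hypothetical helper, even H and W assumed)."""
    return FrameYUV420(
        y=torch.zeros(batch, 1, height, width),
        u=torch.zeros(batch, 1, height // 2, width // 2),
        v=torch.zeros(batch, 1, height // 2, width // 2),
    )


frame = make_blank_frame(batch=1, height=4, width=6)
# At runtime a TypedDict instance is a plain dict of tensors.
print(type(frame) is dict, tuple(frame["u"].shape))
```

Because the value is an ordinary dict at runtime, it can be passed wherever a `Dict[str, Tensor]` is accepted, while static type checkers still see the three named fields.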
- read_yuv(file_path: str, frame_idx: int, frame_data_type: Literal['rgb', 'yuv420', 'yuv444'], bit_depth: Literal[8, 9, 10, 11, 12, 13, 14, 15, 16]) DictTensorYUV | Tensor
From a file_path such as /a/b/c.yuv, read the frame at index frame_idx and return a dictionary of tensors containing the YUV values:
{ 'Y': [1, 1, H, W], 'U': [1, 1, H / S, W / S], 'V': [1, 1, H / S, W / S], }
S is either 1 (444 sampling) or 2 (420 sampling). The YUV values are in [0., 1.].
- Parameters:
file_path (str) – Absolute path of the video to load.
frame_idx (int) – Index of the frame to load, starting at 0.
frame_data_type (Literal['rgb', 'yuv420', 'yuv444']) – Chroma sampling (420 or 444).
bit_depth (Literal[8, 9, 10, 11, 12, 13, 14, 15, 16]) – Number of bits per component (e.g. 8 or 10 bits).
- Returns:
For 420, a dict of tensors with Y of shape [1, 1, H, W] and U, V of shape [1, 1, H / 2, W / 2]. For 444, a [1, 3, H, W] tensor.
- Return type:
DictTensorYUV | Tensor
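To make the shapes above concrete, the sketch below computes how many bytes one frame occupies in a raw planar YUV file and where frame `frame_idx` would start. It assumes the common raw layout (full Y plane, then U, then V; one byte per sample at 8 bits, two bytes per sample above that) and takes the width and height explicitly, since this page does not say how `read_yuv` obtains them. `frame_num_bytes` and `frame_offset` are hypothetical helpers, not part of the documented API.

```python
def frame_num_bytes(width: int, height: int, frame_data_type: str, bit_depth: int) -> int:
    """Bytes used by one planar YUV frame (hypothetical helper, assumed raw layout)."""
    # S is the chroma subsampling factor: 2 for 4:2:0, 1 for 4:4:4.
    s = {"yuv420": 2, "yuv444": 1}[frame_data_type]
    # Assumption: 8-bit samples use one byte, deeper samples use two bytes.
    bytes_per_sample = 1 if bit_depth == 8 else 2
    luma_samples = width * height
    chroma_samples = 2 * (width // s) * (height // s)  # U plane plus V plane
    return (luma_samples + chroma_samples) * bytes_per_sample


def frame_offset(frame_idx: int, width: int, height: int,
                 frame_data_type: str, bit_depth: int) -> int:
    """Byte position of frame `frame_idx` (0-based) in the file (hypothetical helper)."""
    return frame_idx * frame_num_bytes(width, height, frame_data_type, bit_depth)


print(frame_num_bytes(1920, 1080, "yuv420", 8))
print(frame_offset(2, 1920, 1080, "yuv420", 8))
```

Under these assumptions a 4:2:0 frame holds 1.5 samples per pixel and a 4:4:4 frame holds 3, which is why the U and V shapes shrink by S in each spatial dimension.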
- write_yuv(data: Tensor | DictTensorYUV, bitdepth: Literal[8, 9, 10, 11, 12, 13, 14, 15, 16], frame_data_type: Literal['rgb', 'yuv420', 'yuv444'], file_path: str, norm: bool = True) None
Store a YUV frame in a YUV file named file_path. The frame is appended to the end of the file. If norm is True, the data is expected to be in [0., 1.] and is multiplied by 2 ** bitdepth - 1 (255 for 8-bit data). Otherwise it is left as is.
- Parameters:
data (Tensor | DictTensorYUV) – Data to save.
bitdepth (Literal[8, 9, 10, 11, 12, 13, 14, 15, 16]) – Bit depth, must be one of [8, 9, 10, 11, 12, 13, 14, 15, 16].
frame_data_type (Literal['rgb', 'yuv420', 'yuv444']) – Data type, either "yuv420" or "yuv444".
file_path (str) – Absolute path of the file where the YUV is saved.
norm (bool) – True to multiply the data by 2 ** bitdepth - 1. Defaults to True.
- Return type:
None
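The effect of `norm` can be illustrated on a single sample. The helper `to_integer_sample` below is hypothetical; rounding to the nearest integer and clamping to the representable range are assumptions of this sketch, not documented behaviour of `write_yuv`.

```python
def to_integer_sample(value: float, bitdepth: int, norm: bool = True) -> int:
    """Map one sample to an integer code value (hypothetical helper)."""
    max_val = 2 ** bitdepth - 1
    # With norm=True the input is expected in [0., 1.] and is scaled up;
    # with norm=False it is assumed to already be in [0, max_val].
    scaled = value * max_val if norm else value
    # Assumptions of this sketch: round to nearest, then clamp to [0, max_val].
    return min(max(int(round(scaled)), 0), max_val)


print(to_integer_sample(1.0, bitdepth=8))
print(to_integer_sample(1.0, bitdepth=10))
print(to_integer_sample(200.0, bitdepth=8, norm=False))
```

The same normalised value therefore maps to a different integer depending on `bitdepth`, which is why the scale factor has to follow the bit depth rather than being fixed.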
- rgb2yuv(rgb: Tensor) Tensor
Convert a 4D RGB tensor [1, 3, H, W] into a 4D YUV444 tensor [1, 3, H, W]. The RGB and YUV values are in the range [0, 255].
- Parameters:
rgb (Tensor) – 4D RGB tensor to convert, in [0., 255.].
- Returns:
The resulting YUV444 tensor, in [0., 255.].
- Return type:
Tensor
- yuv2rgb(yuv: Tensor)
Convert a 4D YUV444 tensor [1, 3, H, W] into a 4D RGB tensor [1, 3, H, W]. The RGB and YUV values are in the range [0, 255].
- Parameters:
yuv (Tensor) – 4D YUV444 tensor to convert, in [0., 255.].
- Returns:
The resulting RGB tensor, in [0., 255.].
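This page does not state which colour matrix rgb2yuv and yuv2rgb use. As a rough per-pixel illustration of such a pair of conversions in the [0, 255] range, the sketch below assumes BT.709 luma weights and a chroma offset of 128; both choices, and the function names `rgb_to_yuv_pixel` and `yuv_to_rgb_pixel`, are assumptions and may differ from the library.

```python
# BT.709 luma weights: an assumption of this sketch, the page does not name the matrix.
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB


def rgb_to_yuv_pixel(r: float, g: float, b: float) -> tuple:
    """Convert one RGB pixel in [0, 255] to YUV (hypothetical helper)."""
    y = KR * r + KG * g + KB * b
    # Chroma is a scaled colour difference, centred on 128.
    u = (b - y) / (2.0 * (1.0 - KB)) + 128.0
    v = (r - y) / (2.0 * (1.0 - KR)) + 128.0
    return y, u, v


def yuv_to_rgb_pixel(y: float, u: float, v: float) -> tuple:
    """Inverse of rgb_to_yuv_pixel (hypothetical helper)."""
    b = y + 2.0 * (1.0 - KB) * (u - 128.0)
    r = y + 2.0 * (1.0 - KR) * (v - 128.0)
    g = (y - KR * r - KB * b) / KG
    return r, g, b


print(rgb_to_yuv_pixel(100.0, 100.0, 100.0))
print(yuv_to_rgb_pixel(*rgb_to_yuv_pixel(10.0, 200.0, 50.0)))
```

A grey pixel maps to neutral chroma (U and V close to 128), and the two functions invert each other up to floating-point error, which is the property a conversion pair of this kind needs.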
- yuv_dict_clamp(yuv: DictTensorYUV, min_val: float, max_val: float) DictTensorYUV
Clamp the y, u and v tensors.
- Parameters:
yuv (DictTensorYUV) – The data to clamp.
min_val (float) – Minimum value for the clamp.
max_val (float) – Maximum value for the clamp.
- Returns:
The clamped data.
- Return type:
DictTensorYUV
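A minimal sketch of clamping every plane of such a dictionary is given here. `clamp_planes` is a hypothetical stand-in written for illustration, and it uses a plain `Dict[str, Tensor]` with the keys `y`, `u` and `v` rather than the library's own type.

```python
from typing import Dict

import torch
from torch import Tensor


def clamp_planes(planes: Dict[str, Tensor], min_val: float, max_val: float) -> Dict[str, Tensor]:
    """Clamp each tensor of the dict to [min_val, max_val] (hypothetical helper)."""
    # A new dict is returned; the input tensors are left unmodified.
    return {name: torch.clamp(plane, min_val, max_val) for name, plane in planes.items()}


frame = {
    "y": torch.tensor([[[[-0.2, 0.5], [1.3, 1.0]]]]),
    "u": torch.tensor([[[[0.7]]]]),
    "v": torch.tensor([[[[1.1]]]]),
}
clamped = clamp_planes(frame, 0.0, 1.0)
print(clamped["y"])
```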
- yuv_dict_to_device(yuv: DictTensorYUV, device: Literal['cpu', 'cuda:0']) DictTensorYUV
Send a DictTensorYUV to a device.
- Parameters:
yuv (DictTensorYUV) – Data to be sent to a device.
device (Literal['cpu', 'cuda:0']) – The requested device.
- Returns:
Data on the requested device.
- Return type:
DictTensorYUV
- convert_444_to_420(yuv444: Tensor) DictTensorYUV
From a 4D YUV444 tensor \((B, 3, H, W)\), return a DictTensorYUV. The U and V tensors are downsampled using nearest-neighbor downsampling.
- Parameters:
yuv444 (Tensor) – YUV444 data \((B, 3, H, W)\).
- Returns:
YUV420 dictionary of 4D tensors.
- Return type:
DictTensorYUV
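Nearest-neighbor chroma downsampling can be sketched with strided slicing. The function `downsample_444_to_420` below is a hypothetical illustration: it keeps the top-left sample of every 2x2 block, which is one possible nearest-neighbor convention and not necessarily the one the library uses, and it assumes even H and W.

```python
from typing import Dict

import torch
from torch import Tensor


def downsample_444_to_420(yuv444: Tensor) -> Dict[str, Tensor]:
    """Split a [B, 3, H, W] tensor into Y, U, V planes with 2x subsampled chroma."""
    y = yuv444[:, 0:1, :, :]
    # Assumption: keep the top-left sample of each 2x2 block of chroma.
    u = yuv444[:, 1:2, ::2, ::2]
    v = yuv444[:, 2:3, ::2, ::2]
    return {"y": y, "u": u, "v": v}


yuv444 = torch.arange(2 * 3 * 4 * 6, dtype=torch.float32).reshape(2, 3, 4, 6)
frame = downsample_444_to_420(yuv444)
print(tuple(frame["y"].shape), tuple(frame["u"].shape))
```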
- convert_420_to_444(yuv420: DictTensorYUV) Tensor
Convert a DictTensorYUV to a 4D tensor \((B, 3, H, W)\). The U and V tensors are upsampled using nearest-neighbor upsampling.
- Parameters:
yuv420 (DictTensorYUV) – YUV420 dictionary of 4D tensors.
- Returns:
YUV444 tensor \((B, 3, H, W)\).
- Return type:
Tensor
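The reverse direction can be sketched with `torch.nn.functional.interpolate` in `nearest` mode, which repeats each chroma sample over a 2x2 block before the three planes are concatenated along the channel dimension. `upsample_420_to_444` is a hypothetical illustration using a plain dict with keys `y`, `u` and `v`, not the library's implementation.

```python
from typing import Dict

import torch
import torch.nn.functional as F
from torch import Tensor


def upsample_420_to_444(yuv420: Dict[str, Tensor]) -> Tensor:
    """Build a [B, 3, H, W] tensor from Y plus 2x nearest-upsampled U and V."""
    u = F.interpolate(yuv420["u"], scale_factor=2, mode="nearest")
    v = F.interpolate(yuv420["v"], scale_factor=2, mode="nearest")
    # Channel order is Y, U, V.
    return torch.cat([yuv420["y"], u, v], dim=1)


chroma = torch.tensor([[[[1.0, 2.0], [3.0, 4.0]]]])
frame = {"y": torch.zeros(1, 1, 4, 4), "u": chroma, "v": chroma + 10.0}
yuv444 = upsample_420_to_444(frame)
print(tuple(yuv444.shape))
print(yuv444[0, 1])
```

Each chroma value ends up filling a 2x2 block, so subsampling that block again with any nearest-neighbor convention returns the original U and V planes.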