How to apply a 2D transposed convolution operation in PyTorch?
We can apply a 2D transposed convolution operation over an input image composed of several input planes using the **torch.nn.ConvTranspose2d()** module. This module can be seen as the gradient of **Conv2d** with respect to its input.
The input to a 2D transposed convolution layer must be of size **[N,C,H,W]**, where **N** is the batch size, **C** is the number of channels, and **H** and **W** are the height and width of the input image, respectively.
Generally, a 2D transposed convolution operation is applied on image tensors. For an RGB image, the number of channels is 3. The main characteristics of a transposed convolution operation are the filter (kernel) size and the stride. This module supports **TensorFloat32**.
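For instance, a minimal sketch (with arbitrarily chosen sizes) of passing such a 4D input through this module:

import torch
import torch.nn as nn

x = torch.randn(1, 3, 8, 8)            # a batch of one 3-channel 8x8 image, [N, C, H, W]
convt = nn.ConvTranspose2d(3, 6, 2)    # in_channels=3, out_channels=6, kernel_size=2
print(convt(x).size())                 # torch.Size([1, 6, 9, 9])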
Syntax
torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
Parameters
**in_channels** – Number of channels in the input image.
**out_channels** – Number of channels produced by the transposed convolution operation.
**kernel_size** – Size of the convolving kernel.
Along with the above three parameters, there are some optional parameters such as **stride, padding, dilation**, etc. We will use these parameters in different combinations in the Python examples below.
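For reference, along each spatial dimension the output size of **ConvTranspose2d** follows the formula from the PyTorch documentation: output = (input − 1) × stride − 2 × padding + dilation × (kernel_size − 1) + output_padding + 1. A small illustrative helper (not part of PyTorch) that mirrors this formula:

def convtranspose2d_out_size(size_in, kernel_size, stride=1, padding=0, dilation=1, output_padding=0):
   # output size along one spatial dimension, per the ConvTranspose2d documentation
   return (size_in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1

print(convtranspose2d_out_size(4, kernel_size=2))            # 5
print(convtranspose2d_out_size(4, kernel_size=3, stride=2))  # 9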
Steps
You could use the following steps to apply a 2D transposed convolution operation:
Import the required libraries. In all of the following examples, the required Python library is **torch**. Make sure you have already installed it. To apply a 2D transposed convolution operation on an image, we also need **torchvision** and **Pillow**.
import torch
import torch.nn as nn
import torchvision
from PIL import Image
Define the **input** tensor or read the input image. If the input is an image, then we first convert it into a torch tensor.
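A minimal sketch of this step (illustrative; 'car.jpg' is the image file used in Example 2 below):

# Option 1: define a random input tensor of size [N, C, H, W]
input = torch.randn(1, 3, 64, 64)

# Option 2: read an image and convert it to a torch tensor
import torchvision.transforms as T
img = Image.open('car.jpg')
input = T.ToTensor()(img).unsqueeze(0)   # add the batch dimension -> [1, C, H, W]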
Define **in_channels, out_channels, kernel_size**, and other parameters.
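For example, with the same values used in Example 2 below:

in_channels = 3
out_channels = 3
kernel_size = 2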
Next, define a transposed convolution operation **convt** by passing the above-defined parameters to **torch.nn.ConvTranspose2d()**.
convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
Apply the transposed convolution operation **convt** on the input tensor or the image tensor.
output = convt(input)
Next, print the tensor obtained after the transposed convolution operation. If the input was an image tensor, then to visualize the result we first convert this tensor back to a PIL image and then display it, as sketched below.
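A minimal sketch of this visualization step (assuming output was obtained from an image tensor, as in Example 2):

import torchvision.transforms as T

output = output.squeeze(0)          # drop the batch dimension -> [C, H, W]
img_out = T.ToPILImage()(output)    # convert the tensor back to a PIL image
img_out.show()                      # display the image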
Let's take a couple of examples to have a better understanding.
Input Image
We will use the following image as the input file in Example 2.
Example 1
In the following Python example, we perform the 2D transposed convolution operation on an input tensor. We apply different combinations of **kernel_size, stride, padding**, and **dilation**.
# Python 3 program to perform 2D transpose convolution operation
import torch
import torch.nn as nn

'''
torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size,
   stride=1, padding=0)
'''
in_channels = 2
out_channels = 3
kernel_size = 2
convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
# conv = nn.ConvTranspose2d(3, 6, 2)

'''
input of size [N,C,H,W]
   N ==> batch size,
   C ==> number of channels,
   H ==> height of input planes in pixels,
   W ==> width in pixels.
'''
# define the input with the below info
N = 1
C = 2
H = 4
W = 4
input = torch.empty(N, C, H, W).random_(256)
# input = torch.randn(2,3,32,64)
print("Input Tensor:", input)
print("Input Size:", input.size())

# Perform transpose convolution operation
output = convt(input)
print("Output Tensor:", output)
print("Output Size:", output.size())

# With square kernels (3,3) and equal stride
convt = nn.ConvTranspose2d(2, 3, 3, stride=2)
output = convt(input)
print("Output Size:", output.size())

# non-square kernels and unequal stride and with padding
convt = nn.ConvTranspose2d(2, 3, (3, 5), stride=(2, 1), padding=(4, 2))
output = convt(input)
print("Output Size:", output.size())

# non-square kernels and unequal stride and with padding and dilation
convt = nn.ConvTranspose2d(2, 3, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
output = convt(input)
print("Output Size:", output.size())
Output
Input Tensor: tensor([[[[115., 76., 102., 6.],
          [221., 173., 23., 205.],
          [123., 23., 112., 18.],
          [189., 178., 167., 143.]],

         [[239., 180., 226., 88.],
          [224., 30., 196., 224.],
          [ 57., 222., 47., 84.],
          [ 25., 255., 201., 114.]]]])
Input Size: torch.Size([1, 2, 4, 4])
Output Tensor: tensor([[[[ 48.1156, 64.6112, 64.9630, 47.2604, 3.9925],
          [ 74.9169, 80.7055, 138.8992, 82.8471, 54.3722],
          [ 20.0938, 49.5610, 30.2914, 93.3563, 3.1597],
          [ -27.1410, 118.8138, 92.8670, 50.6170, 37.5564],
          [ -27.7676, 6.5762, 33.6408, 6.7176, -8.8372]],

         [[ -18.2188, -56.5362, -49.8063, -43.3336, -16.8645],
          [ -23.4012, -6.1607, 40.5064, -17.4547, -25.1738],
          [ -5.7752, 53.6838, -27.9412, 36.7660, 44.0866],
          [ -23.5205, 1.1443, -29.0826, -34.7213, -4.1535],
          [ 5.6746, 38.4026, 72.8414, 59.2990, 34.9241]],

         [[ -35.0380, -31.4031, -38.0059, -19.3247, -5.6272],
          [-109.2401, -12.9763, -62.2776, -31.0825, 19.2766],
          [ -93.6596, -18.5403, -67.5457, -61.8533, 32.3005],
          [ -27.7020, -71.3938, -18.9532, -26.8304, 20.0184],
          [ -29.2334, -85.8179, -35.4292, -16.4065, 19.0788]]]],
       grad_fn=<SlowConvTranspose2DBackward>)
Output Size: torch.Size([1, 3, 5, 5])
Output Size: torch.Size([1, 3, 9, 9])
Output Size: torch.Size([1, 3, 1, 4])
Output Size: torch.Size([1, 3, 5, 4])
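These sizes agree with the output-size formula quoted earlier. For the last layer, for example, the height is (4 − 1)×2 − 2×4 + 3×(3 − 1) + 1 = 5 and the width is (4 − 1)×1 − 2×2 + 1×(5 − 1) + 1 = 4, which gives torch.Size([1, 3, 5, 4]).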
Example 2
In the following Python example, we perform the 2D transposed convolution operation on an input image. To apply the 2D transposed convolution, we first convert the image to a torch tensor, and after the transposed convolution we convert it back to a PIL image for visualization.
# Python program to perform 2D transpose convolution operation
# Import the required libraries
import torch
import torchvision
from PIL import Image
import torchvision.transforms as T

# Read input image
img = Image.open('car.jpg')

# convert the input image to torch tensor
img = T.ToTensor()(img)
print("Input image size:", img.size()) # size = [3, 464, 700]

# unsqueeze the image to make it a 4D tensor
img = img.unsqueeze(0) # image size = [1, 3, 464, 700]

# define transpose convolution layer
# convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
convt = torch.nn.ConvTranspose2d(3, 3, 2)

# apply transpose convolution operation on image
img = convt(img)

# squeeze image to make it 3D
img = img.squeeze(0) # now image is again 3D
print("Output image size:", img.size())

# convert image to PIL image
img = T.ToPILImage()(img)

# display the image after convolution
img.show()

'''
Note: You may get a different output image after the convolution
operation because the weights initialized may be different at
different runs.
'''
Output
Input image size: torch.Size([3, 464, 700])
Output image size: torch.Size([3, 465, 701])
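The output is one pixel larger in each spatial dimension because a transposed convolution with kernel_size=2, stride=1, and no padding maps an H x W input to (H + 1) x (W + 1), so 464 x 700 becomes 465 x 701.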
Note that you may see some variation in the output image at each run because of the random initialization of the **weights** and **biases**.
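If you want reproducible results across runs, one option (not part of the example above) is to seed PyTorch's random number generator before constructing the layer:

# fix the seed so that the randomly initialized weights (and hence the output) are reproducible
torch.manual_seed(0)
convt = torch.nn.ConvTranspose2d(3, 3, 2)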