How to apply a 2D transposed convolution operation in PyTorch?
We can apply a 2D transposed convolution operation over an input image composed of several input planes using the **torch.nn.ConvTranspose2d()** module. This module can be seen as the gradient of **Conv2d** with respect to its input.
The input to a 2D transposed convolution layer must be of size **[N,C,H,W]**, where **N** is the batch size, **C** is the number of channels, and **H** and **W** are the height and width of the input image, respectively.
Generally, a 2D transposed convolution operation is applied on image tensors. For an RGB image, the number of channels is 3. The main features of a transposed convolution operation are the filter (or kernel) size and the stride. This module supports **TensorFloat32**.
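The relationship with **Conv2d** is easiest to see from the tensor shapes: a transposed convolution with the same kernel size and stride roughly undoes the spatial downsampling of a convolution. The following is only a minimal sketch (the layer sizes are chosen purely for illustration and are not part of the examples below):

import torch
import torch.nn as nn

x = torch.randn(1, 3, 8, 8)                                 # input of size [N, C, H, W]
conv = nn.Conv2d(3, 6, kernel_size=2, stride=2)             # downsamples 8x8 -> 4x4
convt = nn.ConvTranspose2d(6, 3, kernel_size=2, stride=2)   # upsamples 4x4 -> 8x8

y = conv(x)
z = convt(y)
print(y.size())   # torch.Size([1, 6, 4, 4])
print(z.size())   # torch.Size([1, 3, 8, 8])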
Syntax
torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
Parameters
**in_channels** – Number of channels in the input image.
**out_channels** – Number of channels produced by the transposed convolution operation.
**kernel_size** – Size of the convolving kernel.
Along with these three parameters, there are some optional parameters, such as **stride, padding, dilation**, etc. We will take examples of these parameters in detail in the following Python examples.
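For reference, each spatial dimension of the output follows the formula given in the PyTorch documentation: out = (in − 1) × stride − 2 × padding + dilation × (kernel_size − 1) + output_padding + 1. The small helper below is purely illustrative (the function name is our own, not a PyTorch API):

# Illustrative helper that computes one spatial dimension of the
# ConvTranspose2d output, following the documented formula:
# out = (in - 1)*stride - 2*padding + dilation*(kernel_size - 1) + output_padding + 1
def convtranspose2d_out_size(in_size, kernel_size, stride=1, padding=0,
                             dilation=1, output_padding=0):
    return ((in_size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)

print(convtranspose2d_out_size(4, kernel_size=2))   # 5, matching the 5x5 output in Example 1 below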
Steps
You could use the following steps to apply a 2D transposed convolution operation:
Import the required libraries. In all the following examples, the required Python library is **torch**. Make sure you have already installed it. To apply a 2D transposed convolution operation on an image, we also need **torchvision** and **Pillow**.
import torch
import torchvision
from PIL import Image
Define the **input** tensor or read the input image. If the input is an image, we first convert it into a torch tensor.
Define **in_channels, out_channels, kernel_size**, and other parameters.
Next, define the transposed convolution operation convt by passing the above-defined parameters to **torch.nn.ConvTranspose2d()**.
convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
Apply the transposed convolution operation convt to the input tensor or the image tensor.
output = convt(input)
Next, print the tensor obtained after the transposed convolution operation. If the input was an image tensor, then, to visualize the image, we first convert the output tensor to a PIL image and then visualize it.
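Putting these steps together for an image input, a condensed sketch might look like the following (it assumes the same 'car.jpg' file used in Example 2 below):

import torch
import torchvision.transforms as T
from PIL import Image

img = Image.open('car.jpg')                    # read the input image
input = T.ToTensor()(img).unsqueeze(0)         # convert to a 4D torch tensor [N, C, H, W]
convt = torch.nn.ConvTranspose2d(3, 3, 2)      # define the transposed convolution
output = convt(input)                          # apply it to the image tensor
T.ToPILImage()(output.squeeze(0)).show()       # convert back to a PIL image and visualize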
Let's take a couple of examples for a better understanding.
Input Image
We will use the following image as the input file in Example 2.

Example 1
In the following Python example, we perform a 2D transposed convolution operation on an input tensor. We apply different combinations of **kernel_size, stride, padding**, and **dilation**.
# Python 3 program to perform 2D transpose convolution operation
import torch
import torch.nn as nn
'''torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride=1, padding=0)
'''
in_channels = 2
out_channels = 3
kernel_size = 2
convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
# conv = nn.ConvTranspose2d(3, 6, 2)
'''input of size [N,C,H, W]
N==>batch size,
C==> number of channels,
H==> height of input planes in pixels,
W==> width in pixels.
'''
# define the input with below info
N=1
C=2
H=4
W=4
input = torch.empty(N,C,H,W).random_(256)
# input = torch.randn(2,3,32,64)
print("Input Tensor:
", input)
print("Input Size:",input.size())
# Perform transpose convolution operation
output = convt(input)
print("Output Tensor:
", output)
print("Output Size:",output.size())
# With square kernels (3,3) and equal stride
convt = nn.ConvTranspose2d(2, 3, 3, stride=2)
output = convt(input)
print("Output Size:",output.size())
# non-square kernels and unequal stride and with padding
convt = nn.ConvTranspose2d(2, 3, (3, 5), stride=(2, 1), padding=(4, 2))
output = convt(input)
print("Output Size:",output.size())
# non-square kernels and unequal stride and with padding and dilation
convt = nn.ConvTranspose2d(2, 3, (3, 5), stride=(2, 1),
padding=(4, 2), dilation=(3, 1))
output = convt(input)
print("Output Size:",output.size())输出
Input Tensor:
 tensor([[[[115.,  76., 102.,   6.],
          [221., 173.,  23., 205.],
          [123.,  23., 112.,  18.],
          [189., 178., 167., 143.]],

         [[239., 180., 226.,  88.],
          [224.,  30., 196., 224.],
          [ 57., 222.,  47.,  84.],
          [ 25., 255., 201., 114.]]]])
Input Size: torch.Size([1, 2, 4, 4])
Output Tensor:
 tensor([[[[  48.1156,   64.6112,   64.9630,   47.2604,    3.9925],
          [  74.9169,   80.7055,  138.8992,   82.8471,   54.3722],
          [  20.0938,   49.5610,   30.2914,   93.3563,    3.1597],
          [ -27.1410,  118.8138,   92.8670,   50.6170,   37.5564],
          [ -27.7676,    6.5762,   33.6408,    6.7176,   -8.8372]],

         [[ -18.2188,  -56.5362,  -49.8063,  -43.3336,  -16.8645],
          [ -23.4012,   -6.1607,   40.5064,  -17.4547,  -25.1738],
          [  -5.7752,   53.6838,  -27.9412,   36.7660,   44.0866],
          [ -23.5205,    1.1443,  -29.0826,  -34.7213,   -4.1535],
          [   5.6746,   38.4026,   72.8414,   59.2990,   34.9241]],

         [[ -35.0380,  -31.4031,  -38.0059,  -19.3247,   -5.6272],
          [-109.2401,  -12.9763,  -62.2776,  -31.0825,   19.2766],
          [ -93.6596,  -18.5403,  -67.5457,  -61.8533,   32.3005],
          [ -27.7020,  -71.3938,  -18.9532,  -26.8304,   20.0184],
          [ -29.2334,  -85.8179,  -35.4292,  -16.4065,   19.0788]]]],
       grad_fn=<SlowConvTranspose2DBackward>)
Output Size: torch.Size([1, 3, 5, 5])
Output Size: torch.Size([1, 3, 9, 9])
Output Size: torch.Size([1, 3, 1, 4])
Output Size: torch.Size([1, 3, 5, 4])
Example 2
In the following Python example, we perform a 2D transposed convolution operation on an input image. To apply the 2D transposed convolution, we first convert the image to a torch tensor and, after the transposed convolution, convert it back to a PIL image for visualization.
# Python program to perform 2D transpose convolution operation
# Import the required libraries
import torch
import torchvision
from PIL import Image
import torchvision.transforms as T
# Read input image
img = Image.open('car.jpg')
# convert the input image to torch tensor
img = T.ToTensor()(img)
print("Input image size:", img.size()) # size = [3, 464, 700]
# unsqueeze the image to make it 4D tensor
img = img.unsqueeze(0) # image size = [1, 3, 464, 700]
# define transpose convolution layer
# convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
convt = torch.nn.ConvTranspose2d(3, 3, 2)
# apply transpose convolution operation on image
img = convt(img)
# squeeze image to make it 3D
img = img.squeeze(0) # now image is again 3D
print("Output image size:",img.size())
# convert image to PIL image
img = T.ToPILImage()(img)
# display the image after convolution
img.show()
'''
Note: You may get a different output image after the transposed convolution
operation because the initialized weights may differ between runs.
'''
Output
Input image size: torch.Size([3, 464, 700])
Output image size: torch.Size([3, 465, 701])

Note that you may see some variation in the image obtained after each run because of the initialization of the **weights** and **biases**.