How to apply a 2D transposed convolution operation in PyTorch?
We can apply a 2D transposed convolution operation over an input image composed of several input planes using the **torch.nn.ConvTranspose2d()** module. This module can be seen as the gradient of **Conv2d** with respect to its input.
The input to a 2D transposed convolution layer must be of size **[N,C,H,W]**, where **N** is the batch size, **C** is the number of channels, and **H** and **W** are the height and width of the input image, respectively.
Generally, a 2D transposed convolution operation is applied on image tensors. For an RGB image, the number of channels is 3. The main characteristics of a transposed convolution operation are the filter (kernel) size and the stride. This module supports **TensorFloat32**.
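For instance, a minimal sketch (with arbitrarily chosen sizes) of passing such a 4D input through this module:

import torch
import torch.nn as nn

x = torch.randn(1, 3, 8, 8)            # a batch of one 3-channel 8x8 image, [N, C, H, W]
convt = nn.ConvTranspose2d(3, 6, 2)    # in_channels=3, out_channels=6, kernel_size=2
print(convt(x).size())                 # torch.Size([1, 6, 9, 9])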
Syntax
torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
Parameters
**in_channels** – Number of channels in the input image.
**out_channels** – Number of channels produced by the transposed convolution operation.
**kernel_size** – Size of the convolving kernel.
Along with the above three parameters, there are some optional parameters such as **stride, padding, dilation**, etc. We will use these parameters in different combinations in the Python examples below.
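For reference, along each spatial dimension the output size of **ConvTranspose2d** follows the formula from the PyTorch documentation: output = (input − 1) × stride − 2 × padding + dilation × (kernel_size − 1) + output_padding + 1. A small illustrative helper (not part of PyTorch) that mirrors this formula:

def convtranspose2d_out_size(size_in, kernel_size, stride=1, padding=0, dilation=1, output_padding=0):
   # output size along one spatial dimension, per the ConvTranspose2d documentation
   return (size_in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1

print(convtranspose2d_out_size(4, kernel_size=2))            # 5
print(convtranspose2d_out_size(4, kernel_size=3, stride=2))  # 9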
Steps
You could use the following steps to apply a 2D transposed convolution operation:
Import the required libraries. In all of the following examples, the required Python library is **torch**. Make sure you have already installed it. To apply a 2D transposed convolution operation on an image, we also need **torchvision** and **Pillow**.
import torch
import torch.nn as nn
import torchvision
from PIL import Image
Define the **input** tensor or read the input image. If the input is an image, then we first convert it into a torch tensor.
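A minimal sketch of this step (illustrative; 'car.jpg' is the image file used in Example 2 below):

# Option 1: define a random input tensor of size [N, C, H, W]
input = torch.randn(1, 3, 64, 64)

# Option 2: read an image and convert it to a torch tensor
import torchvision.transforms as T
img = Image.open('car.jpg')
input = T.ToTensor()(img).unsqueeze(0)   # add the batch dimension -> [1, C, H, W]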
Define **in_channels, out_channels, kernel_size**, and other parameters.
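For example, with the same values used in Example 2 below:

in_channels = 3
out_channels = 3
kernel_size = 2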
Next, define a transposed convolution operation **convt** by passing the above-defined parameters to **torch.nn.ConvTranspose2d()**.
convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
Apply the transposed convolution operation **convt** on the input tensor or the image tensor.
output = convt(input)
Next, print the tensor obtained after the transposed convolution operation. If the input was an image tensor, then to visualize the result we first convert this tensor back to a PIL image and then display it, as sketched below.
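A minimal sketch of this visualization step (assuming output was obtained from an image tensor, as in Example 2):

import torchvision.transforms as T

output = output.squeeze(0)          # drop the batch dimension -> [C, H, W]
img_out = T.ToPILImage()(output)    # convert the tensor back to a PIL image
img_out.show()                      # display the image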
Let's take a couple of examples to have a better understanding.
Input Image
We will use the following image as the input file in Example 2.
Example 1
In the following Python example, we perform the 2D transposed convolution operation on an input tensor. We apply different combinations of **kernel_size, stride, padding**, and **dilation**.
# Python 3 program to perform 2D transpose convolution operation
import torch
import torch.nn as nn

'''
torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size,
   stride=1, padding=0)
'''
in_channels = 2
out_channels = 3
kernel_size = 2
convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
# conv = nn.ConvTranspose2d(3, 6, 2)

'''
input of size [N,C,H,W]
   N ==> batch size,
   C ==> number of channels,
   H ==> height of input planes in pixels,
   W ==> width in pixels.
'''
# define the input with the below info
N = 1
C = 2
H = 4
W = 4
input = torch.empty(N, C, H, W).random_(256)
# input = torch.randn(2,3,32,64)
print("Input Tensor:", input)
print("Input Size:", input.size())

# Perform transpose convolution operation
output = convt(input)
print("Output Tensor:", output)
print("Output Size:", output.size())

# With square kernels (3,3) and equal stride
convt = nn.ConvTranspose2d(2, 3, 3, stride=2)
output = convt(input)
print("Output Size:", output.size())

# non-square kernels and unequal stride and with padding
convt = nn.ConvTranspose2d(2, 3, (3, 5), stride=(2, 1), padding=(4, 2))
output = convt(input)
print("Output Size:", output.size())

# non-square kernels and unequal stride and with padding and dilation
convt = nn.ConvTranspose2d(2, 3, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
output = convt(input)
print("Output Size:", output.size())
Output
Input Tensor: tensor([[[[115., 76., 102., 6.],
          [221., 173., 23., 205.],
          [123., 23., 112., 18.],
          [189., 178., 167., 143.]],

         [[239., 180., 226., 88.],
          [224., 30., 196., 224.],
          [ 57., 222., 47., 84.],
          [ 25., 255., 201., 114.]]]])
Input Size: torch.Size([1, 2, 4, 4])
Output Tensor: tensor([[[[ 48.1156, 64.6112, 64.9630, 47.2604, 3.9925],
          [ 74.9169, 80.7055, 138.8992, 82.8471, 54.3722],
          [ 20.0938, 49.5610, 30.2914, 93.3563, 3.1597],
          [ -27.1410, 118.8138, 92.8670, 50.6170, 37.5564],
          [ -27.7676, 6.5762, 33.6408, 6.7176, -8.8372]],

         [[ -18.2188, -56.5362, -49.8063, -43.3336, -16.8645],
          [ -23.4012, -6.1607, 40.5064, -17.4547, -25.1738],
          [ -5.7752, 53.6838, -27.9412, 36.7660, 44.0866],
          [ -23.5205, 1.1443, -29.0826, -34.7213, -4.1535],
          [ 5.6746, 38.4026, 72.8414, 59.2990, 34.9241]],

         [[ -35.0380, -31.4031, -38.0059, -19.3247, -5.6272],
          [-109.2401, -12.9763, -62.2776, -31.0825, 19.2766],
          [ -93.6596, -18.5403, -67.5457, -61.8533, 32.3005],
          [ -27.7020, -71.3938, -18.9532, -26.8304, 20.0184],
          [ -29.2334, -85.8179, -35.4292, -16.4065, 19.0788]]]],
       grad_fn=<SlowConvTranspose2DBackward>)
Output Size: torch.Size([1, 3, 5, 5])
Output Size: torch.Size([1, 3, 9, 9])
Output Size: torch.Size([1, 3, 1, 4])
Output Size: torch.Size([1, 3, 5, 4])
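These sizes agree with the output-size formula quoted earlier. For the last layer, for example, the height is (4 − 1)×2 − 2×4 + 3×(3 − 1) + 1 = 5 and the width is (4 − 1)×1 − 2×2 + 1×(5 − 1) + 1 = 4, which gives torch.Size([1, 3, 5, 4]).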
Example 2
In the following Python example, we perform the 2D transposed convolution operation on an input image. To apply the 2D transposed convolution, we first convert the image to a torch tensor, and after the transposed convolution we convert it back to a PIL image for visualization.
# Python program to perform 2D transpose convolution operation
# Import the required libraries
import torch
import torchvision
from PIL import Image
import torchvision.transforms as T

# Read input image
img = Image.open('car.jpg')

# convert the input image to torch tensor
img = T.ToTensor()(img)
print("Input image size:", img.size()) # size = [3, 464, 700]

# unsqueeze the image to make it a 4D tensor
img = img.unsqueeze(0) # image size = [1, 3, 464, 700]

# define transpose convolution layer
# convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
convt = torch.nn.ConvTranspose2d(3, 3, 2)

# apply transpose convolution operation on image
img = convt(img)

# squeeze image to make it 3D
img = img.squeeze(0) # now image is again 3D
print("Output image size:", img.size())

# convert image to PIL image
img = T.ToPILImage()(img)

# display the image after convolution
img.show()

'''
Note: You may get a different output image after the convolution
operation because the weights initialized may be different at
different runs.
'''
Output
Input image size: torch.Size([3, 464, 700])
Output image size: torch.Size([3, 465, 701])
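The output is one pixel larger in each spatial dimension because a transposed convolution with kernel_size=2, stride=1, and no padding maps an H x W input to (H + 1) x (W + 1), so 464 x 700 becomes 465 x 701.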
Note that you may see some variation in the output image at each run because of the random initialization of the **weights** and **biases**.
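If you want reproducible results across runs, one option (not part of the example above) is to seed PyTorch's random number generator before constructing the layer:

# fix the seed so that the randomly initialized weights (and hence the output) are reproducible
torch.manual_seed(0)
convt = torch.nn.ConvTranspose2d(3, 3, 2)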