How to read a JPEG or PNG image in PyTorch?

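Reading images is a fundamental step in image processing and computer vision tasks. The **torchvision.io** package provides functions for various **IO** operations. To read an image, it provides the **read_image()** function, which reads **JPEG** and **PNG** files and returns a **3D RGB** or **grayscale** tensor.

The three dimensions of the tensor correspond to **[C, H, W]**. **C** is the number of channels, and **H** and **W** are the height and width of the image, respectively.

For an **RGB** image, the number of channels is 3, so reading it produces a tensor of shape **[3, H, W]**. The values in the output tensor lie in the range **[0, 255]**.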

Syntax

torchvision.io.read_image(path)

Parameters

**path** - Path of the input JPEG or PNG image.

Output

It returns a torch tensor of size **[image_channels, image_height, image_width]**.

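Steps

You can use the following steps to read and visualize a JPEG or PNG image in PyTorch.

Import the required libraries. In all the following examples, the required Python libraries are **torch** and **torchvision**. Make sure you have already installed them.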

import torch
import torchvision
from torchvision.io import read_image
import torchvision.transforms as T

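Read a **JPEG** or **PNG** image using the **read_image()** function. Specify the full image path, including the file extension (.jpg or .png). The output of this function is a torch tensor of size **[image_channels, image_height, image_width]**.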

img = read_image('butterfly.jpg')

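Optionally, compute different image properties, such as the image type and the image size.
To display the image, first convert the image tensor to a PIL image and then show it.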

img = T.ToPILImage()(img)
img.show()

Input Images

The examples below use the image files **butterfly.jpg** and **elephant.png** as input.

Example 1

The following is the complete Python code for reading a JPEG image with PyTorch.

# Import the required libraries
import torch
import torchvision
from torchvision.io import read_image
import torchvision.transforms as T

# read a JPEG image
img = read_image('butterfly.jpg')

# display the image properties
print("Image data:\n", img)

# check if input image is a PyTorch tensor
print("Is image a PyTorch Tensor:", torch.is_tensor(img))
print("Type of Image:", type(img))

# size of the image
print(img.size())

# convert the torch tensor to PIL image
img = T.ToPILImage()(img)

# display the image
img.show()

Output

Image data:
   tensor([[[146, 169, 191, ..., 71, 61, 53],
      [140, 169, 192, ..., 75, 63, 53],
      [126, 161, 186, ..., 85, 68, 58],
      ...,
      [ 33, 31, 30, ..., 218, 221, 223],
      [ 30, 30, 31, ..., 216, 219, 224],
      [ 41, 45, 52, ..., 218, 219, 220]],

      [[130, 151, 170, ..., 47, 41, 35],
      [124, 151, 171, ..., 52, 42, 36],
      [110, 145, 168, ..., 61, 48, 39],
      ...,
      [ 29, 26, 25, ..., 197, 198, 200],
      [ 25, 25, 26, ..., 195, 198, 200],
      [ 20, 25, 33, ..., 200, 201, 202]],

      [[ 79, 101, 123, ..., 21, 17, 13],
      [ 73, 101, 126, ..., 21, 13, 10],
      [ 61, 96, 122, ..., 23, 11, 6],
      ...,
      [ 20, 20, 19, ..., 166, 167, 169],
      [ 19, 19, 20, ..., 164, 167, 172],
      [ 25, 27, 29, ..., 164, 165, 166]]],
dtype=torch.uint8)
Is image a PyTorch Tensor: True
Type of Image: <class 'torch.Tensor'>
torch.Size([3, 465, 700])

Note that the output of **read_image()** is a torch tensor with values in the range [0, 255], and the type of the tensor is **torch.uint8**.

Example 2

In this Python code, we will see how to read a **PNG** image using PyTorch.

import torch
import torchvision

# read a png image
img = torchvision.io.read_image('elephant.png')

# display the properties of image
print("Image data:\n", img)
print(img.size())
print(type(img))

# display the png image
# convert the image tensor to PIL image
img = torchvision.transforms.ToPILImage()(img)

# display the PIL image
img.show()

Output

Image data:
   tensor([[[ 14, 13, 11, ..., 22, 21, 13],
      [ 13, 12, 9, ..., 24, 27, 21],
      [ 12, 10, 7, ..., 26, 33, 32],
      ...,
      [ 54, 15, 25, ..., 39, 76, 111],
      [ 79, 29, 32, ..., 38, 61, 84],
      [112, 60, 60, ..., 23, 47, 72]],

      [[ 14, 13, 11, ..., 11, 11, 5],
      [ 13, 12, 9, ..., 14, 17, 13],
      [ 12, 10, 7, ..., 15, 23, 23],
      ...,
      [ 38, 0, 9, ..., 25, 62, 97],
      [ 58, 8, 9, ..., 28, 50, 70],
      [ 91, 39, 37, ..., 13, 36, 58]],

      [[ 12, 11, 9, ..., 15, 12, 2],
      [ 11, 10, 7, ..., 15, 16, 10],
      [ 10, 8, 5, ..., 13, 21, 18],
      ...,
      [ 38, 0, 9, ..., 24, 61, 96],
      [ 65, 15, 15, ..., 27, 48, 67],
      [ 98, 46, 43, ..., 12, 34, 55]]],
dtype=torch.uint8)
torch.Size([3, 466, 700])
<class 'torch.Tensor'>

Note that, as in Example 1, the output of **read_image()** is a torch tensor with values in the range [0, 255], and the type of the tensor is **torch.uint8**.

Shahid Akhtar Khan

Updated on: January 20, 2022
