How can Python and TensorFlow be used to visualize data?

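Suppose we have a flowers dataset. It can be downloaded from a Google-hosted URL that points to the dataset archive. The URL is passed as a parameter to the “get_file” method, which downloads the data into the environment.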
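The download step might look like the minimal sketch below. The archive URL is an assumption: it is not given in the text above, and is the location used by the TensorFlow image classification tutorial. The helper name download_flowers is hypothetical.

```python
# Assumed archive URL (not stated in the text above); it is the location used
# by the TensorFlow image classification tutorial.
FLOWERS_URL = (
    "https://storage.googleapis.com/download.tensorflow.org/"
    "example_images/flower_photos.tgz"
)


def download_flowers(url=FLOWERS_URL):
    """Download and unpack the archive, returning the path reported by get_file."""
    # Imported inside the function so this file can be loaded even when
    # TensorFlow is not installed yet.
    import tensorflow as tf

    # get_file caches the download, so repeated calls do not fetch it again.
    # Whether the returned path is the archive file or the extracted directory
    # depends on the Keras version, so inspect it before building on it.
    return tf.keras.utils.get_file(
        fname="flower_photos.tgz", origin=url, extract=True
    )
```

Calling download_flowers() requires network access. Because the layout of the returned path differs between Keras versions, it is worth printing the path and checking where the class sub-directories ended up.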

The data can then be visualized with the “matplotlib” library, whose “imshow” method displays an image in the notebook output.

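We will use the Keras Sequential API to build a sequential model, which is suited to a plain stack of layers where every layer has exactly one input tensor and one output tensor.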
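As a rough illustration of such a layer stack, the sketch below builds a small convolutional classifier. The layer sizes, the 180x180 input size, the default of 5 classes and the helper name build_classifier are illustrative assumptions, not the article's model. It also assumes TensorFlow 2.6 or newer, where layers.Rescaling is available.

```python
def build_classifier(num_classes=5, image_size=(180, 180)):
    """Build a small image classifier as a plain stack of layers."""
    # Imported inside the function so this file can be loaded even when
    # TensorFlow is not installed yet.
    import tensorflow as tf

    layers = tf.keras.layers
    # Each layer receives exactly one tensor and returns exactly one tensor,
    # which is the case the Sequential API is designed for.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(*image_size, 3)),
        layers.Rescaling(1.0 / 255),  # map pixel values from [0, 255] to [0, 1]
        layers.Conv2D(16, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes),  # raw logits, one per class
    ])
```

Calling build_classifier().summary() prints the stack layer by layer, which makes the one-input, one-output chain easy to check.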

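We will create an image classifier using a keras.Sequential model and load the data with preprocessing.image_dataset_from_directory, which reads the images efficiently from disk. We will also identify overfitting and apply techniques to mitigate it, namely data augmentation and dropout. The dataset has about 3,700 flower images and contains 5 sub-directories, one per class: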

daisy,
dandelion,
roses,
sunflowers, and
tulips.
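One way to check this directory layout after downloading is to count the image files in each class sub-directory. The sketch below assumes the images are .jpg files. The function name count_images_per_class and the data_dir argument are hypothetical and should point at wherever the dataset was extracted.

```python
from pathlib import Path


def count_images_per_class(data_dir, pattern="*.jpg"):
    """Return {class_name: number of files matching pattern} per sub-directory."""
    data_dir = Path(data_dir)
    return {
        sub.name: len(list(sub.glob(pattern)))
        for sub in sorted(data_dir.iterdir())
        if sub.is_dir()  # skip loose files such as a license text
    }
```

Summing the values of the returned dictionary gives the total number of images, which can be compared with the figure quoted above.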

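We are using Google Colaboratory to run the code below. Google Colab (Colaboratory) runs Python code in the browser with no configuration and provides free access to GPUs (Graphical Processing Units). Colaboratory is built on top of Jupyter Notebook.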

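import matplotlib.pyplot as plt

# train_ds and class_names are assumed to have been created earlier,
# for example by image_dataset_from_directory.
print("Visualizing the dataset")
plt.figure(figsize=(10, 10))
# Take a single batch and plot its first 6 images on a 3x3 grid.
for images, labels in train_ds.take(1):
    for i in range(6):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")

# Print the shapes of the first batch of images and labels, then stop.
for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break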

Code credit: https://tensorflowcn.cn/tutorials/images/classification

Output

Visualizing the dataset
(32, 180, 180, 3)
(32,)

Explanation

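These datasets are later passed to the fit method to train the model; the dataset can also be iterated over manually to retrieve batches of images.
The shapes of the first batch are printed to the console.
image_batch is a tensor of shape (32, 180, 180, 3).
This is a batch of 32 images of shape 180x180x3, where the last dimension holds the 3 RGB color channels.
labels_batch is a tensor of shape (32,), holding the labels that correspond to the 32 images.
.numpy() can be called on the image_batch and labels_batch tensors to convert them to numpy.ndarray.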
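To make the meaning of each axis explicit, a small helper can name the four dimensions of an image-batch shape. This is an illustrative sketch in plain Python; describe_batch_shape is a hypothetical name, not part of TensorFlow.

```python
def describe_batch_shape(shape):
    """Name the four axes of an image-batch shape: batch, height, width, channels."""
    if len(shape) != 4:
        raise ValueError(f"expected 4 dimensions, got {len(shape)}")
    batch_size, height, width, channels = shape
    return {
        "batch_size": batch_size,
        "height": height,
        "width": width,
        "channels": channels,
    }


info = describe_batch_shape((32, 180, 180, 3))
print(info)
# {'batch_size': 32, 'height': 180, 'width': 180, 'channels': 3}
```

With a real batch, the shape could be passed as tuple(image_batch.shape) or as image_batch.numpy().shape.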

AmitDiwan

Updated on: 20 February 2021
