How can TensorFlow be used to split the flower dataset into training and validation sets?

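The flower dataset can be split into training and validation sets with the Keras preprocessing API. The "image_dataset_from_directory" function takes a validation_split argument, the fraction of the data to hold out for validation, and a subset argument that selects which side of the split to load.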

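An image classifier is built with a keras.Sequential model, and the data is loaded with **preprocessing.image_dataset_from_directory**, which reads images from disk efficiently. Overfitting is then identified and mitigated with techniques such as data augmentation and dropout. The dataset contains about 3,700 flower images (3,670, according to the output below), organized into 5 subdirectories, one per class: daisy, dandelion, roses, sunflowers, and tulips.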

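The code below is run in Google Colaboratory. Google Colab, or Colaboratory, runs Python code in the browser with no configuration required and provides free access to GPUs (Graphical Processing Units). Colaboratory is built on top of Jupyter Notebook.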

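import tensorflow as tf

# data_dir is assumed to already point to the downloaded flower image
# directory, which holds one subdirectory per class.
batch_size = 32
img_height = 180
img_width = 180
print("The data is being split into training and validation set")
# Hold out 20% of the images for validation and load the remaining 80%.
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)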

Code credit: https://tensorflowcn.cn/tutorials/images/classification

Output

The data is being split into training and validation set
Found 3670 files belonging to 5 classes.
Using 2936 files for training.
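The file counts in this output follow directly from validation_split=0.2. The sketch below reproduces the arithmetic in plain Python. The function name split_counts is hypothetical, and the sketch assumes that the validation share is truncated to a whole number of files; it illustrates the calculation and is not the library's own code.

```python
from typing import Tuple


def split_counts(num_files: int, validation_split: float) -> Tuple[int, int]:
    """Return (training_count, validation_count) for a fractional split."""
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be between 0 and 1")
    # Assumption: the validation share is truncated to a whole number of files.
    num_val = int(validation_split * num_files)
    return num_files - num_val, num_val


# 3670 files in total, 20% held out for validation.
train_count, val_count = split_counts(3670, 0.2)
print(f"Using {train_count} files for training.")
print(f"Using {val_count} files for validation.")
```

With 3,670 files and a 0.2 split, this gives 2,936 training files, matching the line printed above, and leaves 734 files for validation.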

Explanation

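The images are loaded from disk using the image_dataset_from_directory utility.
It converts a directory of images on disk into a tf.data.Dataset.
Once the data has been downloaded, a few parameters are defined for the loader: the batch size and the image height and width.
The data is split 80/20: with validation_split=0.2 and subset="training", 2,936 of the 3,670 images are used for training and the remaining 20% is held out for validation.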
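The code above loads only the training subset. The validation subset can be loaded with a second call that passes subset="validation" and keeps validation_split and seed unchanged, so that the two subsets do not overlap. A minimal sketch follows; the helper names split_kwargs and load_split are hypothetical, and data_dir is assumed to point to the flower image directory used above.

```python
def split_kwargs(subset, validation_split=0.2, seed=123,
                 image_size=(180, 180), batch_size=32):
    """Build the keyword arguments for one side of the split (hypothetical helper)."""
    if subset not in ("training", "validation"):
        raise ValueError("subset must be 'training' or 'validation'")
    return {
        "validation_split": validation_split,
        "subset": subset,
        "seed": seed,
        "image_size": image_size,
        "batch_size": batch_size,
    }


def load_split(data_dir, subset):
    """Load one subset as a tf.data.Dataset (hypothetical helper)."""
    # TensorFlow is imported lazily so split_kwargs can be used without it.
    import tensorflow as tf
    return tf.keras.preprocessing.image_dataset_from_directory(
        data_dir, **split_kwargs(subset))


# Usage, assuming data_dir points to the flower image directory:
# train_ds = load_split(data_dir, "training")
# val_ds = load_split(data_dir, "validation")
```

Building both calls from the same arguments keeps the seed and the split fraction identical, which is what makes the training and validation subsets complementary.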

AmitDiwan

Updated on: February 20, 2021

