How can TensorFlow be used to standardize the flowers dataset?

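Data standardization means scaling a dataset to a common level so that all features are expressed in comparable units. Here, a rescaling layer is built with the 'Rescaling' layer class from the Keras module, and it is applied to the entire dataset with the dataset's 'map' method.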
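To make the arithmetic concrete, here is a minimal sketch on a tiny made-up tensor. It assumes TensorFlow 2.6 or later, where the layer is available as tf.keras.layers.Rescaling. The pixel values are hypothetical and only illustrate that the layer computes inputs * scale + offset.

```python
import numpy as np
import tensorflow as tf

# Hypothetical 2x2 RGB "image" with values in [0, 255] (illustration only).
pixels = tf.constant(
    [[[0, 128, 255], [64, 64, 64]],
     [[255, 0, 0], [32, 200, 16]]],
    dtype=tf.float32,
)

# Rescaling computes inputs * scale + offset (offset defaults to 0),
# so a scale of 1/255 maps [0, 255] onto [0, 1].
to_unit_range = tf.keras.layers.Rescaling(1.0 / 255)
scaled = to_unit_range(pixels).numpy()

# A different scale and offset maps the same pixels onto [-1, 1] instead.
to_signed_range = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1)
signed = to_signed_range(pixels).numpy()

print(scaled.min(), scaled.max())
print(signed.min(), signed.max())
```

Which target range to use depends on what the downstream model expects; this article uses the [0, 1] variant.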

We will use the flowers dataset, which contains several thousand images of flowers. It contains 5 subdirectories, one for each class.
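A directory laid out this way, with one subfolder per class, can be turned into a batched dataset roughly as follows. This is a sketch under assumptions, not necessarily the exact loading code used for this article: the helper name load_training_split, the 180x180 image size, the batch size, the 80/20 split and the seed are illustrative choices, and it assumes a TensorFlow version that provides tf.keras.utils.image_dataset_from_directory.

```python
import tensorflow as tf


def load_training_split(data_dir, image_size=(180, 180), batch_size=32):
    """Build a training dataset from a directory with one subfolder per class.

    Hypothetical helper: the default sizes, split and seed are assumptions.
    """
    return tf.keras.utils.image_dataset_from_directory(
        data_dir,
        validation_split=0.2,   # hold out 20% of the images for validation
        subset="training",      # return only the training portion
        seed=123,               # fixed seed so the split is reproducible
        image_size=image_size,  # every image is resized to this height x width
        batch_size=batch_size,
    )
```

Calling train_ds = load_training_split(data_dir), where data_dir points at the extracted flowers folder, would yield batches of (images, labels). The images still hold pixel values in the 0 to 255 range, and train_ds.class_names lists the subdirectory names.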

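We use Google Colaboratory to run the code below. Google Colab, or Colaboratory, runs Python code in the browser with zero configuration and provides free access to GPUs (Graphics Processing Units). Colaboratory is built on top of Jupyter Notebook.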

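import numpy as np
import tensorflow as tf

# train_ds is assumed to be an existing tf.data.Dataset that yields
# (image_batch, label_batch) pairs with pixel values in [0, 255].
print("Standardizing the data using a rescaling layer")
# In TensorFlow releases before 2.6 this layer lives under
# tf.keras.layers.experimental.preprocessing.Rescaling.
normalization_layer = tf.keras.layers.Rescaling(1./255)

print("This layer can be applied by calling the map function on the dataset")
# Rescale the images and pass the labels through unchanged.
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))
first_image = image_batch[0]
# The pixel values should now lie in [0, 1].
print(np.min(first_image), np.max(first_image))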

Code credit: https://tensorflowcn.cn/tutorials/load_data/images

Output

Standardizing the data using a rescaling layer
This layer can be applied by calling the map function on the dataset
0.0 0.96902645

Explanation

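The RGB channel values are in the range 0 to 255.
This is not ideal for a neural network.
As a rule of thumb, input values should be kept small.
The values in the images are standardized to the range 0 to 1.
This is done with the help of the rescaling layer.
An alternative is to include the rescaling layer in the model definition, which simplifies deployment.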
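The alternative mentioned in the last point could look roughly like the sketch below. The architecture is a deliberately tiny placeholder rather than a recommended classifier, the 180x180 input size is an assumption, and the random batch only stands in for real photos. It again assumes TensorFlow 2.6 or later.

```python
import numpy as np
import tensorflow as tf

num_classes = 5  # the flowers dataset has five classes

# Rescaling is the first layer, so the model accepts raw [0, 255] pixels
# and the scaling step travels with the model when it is saved or deployed.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(180, 180, 3)),  # assumed input size
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(num_classes),
])

# Hypothetical batch of two unscaled images, standing in for real photos.
raw_batch = np.random.randint(0, 256, size=(2, 180, 180, 3)).astype("float32")
logits = model(raw_batch)
print(logits.shape)
```

With this layout the dataset does not need to be mapped through a separate rescaling step, and whoever serves the model does not have to remember to divide by 255.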

AmitDiwan

Updated on: 19 February 2021
