Machine Learning - Grid Search

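Grid search is a hyperparameter tuning technique in machine learning that finds the best combination of hyperparameters for a given model. It works by defining a grid of candidate hyperparameter values, training the model with every possible combination of them, and selecting the combination that performs best.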

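In other words, grid search is an exhaustive search: you specify a set of candidate values for each hyperparameter, and the search evaluates every combination of those values to find the one that yields the best performance.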
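To make "every combination" concrete, here is a minimal sketch in plain Python that enumerates a grid. The grid values are assumptions chosen only for illustration, and the sketch only lists the combinations; it does not train or score a model.

```python
from itertools import product

# Hypothetical grid: two hyperparameters with three candidate values each
param_grid = {"n_estimators": [10, 50, 100], "max_depth": [None, 5, 10]}

# Build one dict per combination (the Cartesian product of all value lists)
names = list(param_grid)
combinations = [
    dict(zip(names, values))
    for values in product(*(param_grid[name] for name in names))
]

print(len(combinations))  # 3 * 3 = 9 combinations
for combo in combinations:
    print(combo)
```

The number of combinations is the product of the list lengths, so each additional hyperparameter multiplies the number of models that have to be trained.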

Implementation in Python

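In Python, grid search can be implemented with the GridSearchCV class from scikit-learn's sklearn.model_selection module. GridSearchCV takes a model, the hyperparameters to tune, and a scoring function as input. It then runs an exhaustive search over all possible hyperparameter combinations, evaluating each one with cross-validation, and returns the hyperparameter set that gives the best score.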

The following example implements grid search in Python using the GridSearchCV class:

Example

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

# Generate a sample dataset
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2)

# Define the model and the hyperparameters to tune
model = RandomForestClassifier()
hyperparameters = {'n_estimators': [10, 50, 100], 'max_depth': [None, 5, 10]}

# Define the Grid Search object and fit the data
grid_search = GridSearchCV(model, hyperparameters, scoring='accuracy', cv=5)
grid_search.fit(X, y)

# Print the best hyperparameters and the corresponding score
print("Best hyperparameters: ", grid_search.best_params_)
print("Best score: ", grid_search.best_score_)

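In this example, we define a RandomForestClassifier model and a set of hyperparameters to tune: the number of trees (n_estimators) and the maximum depth of each tree (max_depth). With three candidate values for each, the grid contains 3 × 3 = 9 combinations. We then create a GridSearchCV object that scores each combination by accuracy under 5-fold cross-validation (cv=5), and fit it to the data with the fit() method. Finally, we print the best hyperparameter set and its score, which is the mean cross-validated accuracy for that combination.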

Output

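Running this code produces output similar to the following. The exact values vary from run to run because neither the dataset generation nor the random forest uses a fixed random seed: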

Best hyperparameters: {'max_depth': None, 'n_estimators': 10}
Best score: 0.953
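Beyond the single best result, a fitted GridSearchCV object also keeps the score of every combination in its cv_results_ attribute and, with the default refit=True, a model retrained on the full data in best_estimator_. The sketch below shows one way to inspect them. The smaller grid, the smaller dataset, and the fixed random_state values are assumptions made here so that the example runs quickly and repeatably; they are not part of the example above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Assumption: fixed seeds so repeated runs give the same result
X, y = make_classification(n_samples=300, n_features=10, n_classes=2, random_state=0)

# Assumption: a reduced 2 x 2 grid to keep the run short
param_grid = {"n_estimators": [10, 50], "max_depth": [None, 5]}
grid_search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, scoring="accuracy", cv=5
)
grid_search.fit(X, y)

# Mean and standard deviation of the cross-validated accuracy per combination
results = grid_search.cv_results_
for params, mean, std in zip(
    results["params"], results["mean_test_score"], results["std_test_score"]
):
    print(f"{params}: {mean:.3f} +/- {std:.3f}")

# The best combination, refit on all of X and y, is ready for prediction
best_model = grid_search.best_estimator_
print(best_model.predict(X[:5]))
```

Looking at the full table of scores helps show whether the best combination is clearly ahead or only marginally better than a cheaper one, such as a forest with fewer trees.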
