CatBoost - Classification Metrics



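CatBoost is widely used for classification tasks. Classification means assigning each data point to one of a set of categories, and we rely on several metrics to analyze how well a CatBoost model does this.

No single number tells the whole story, so CatBoost provides a range of metrics for evaluating model performance.

The following key metrics can be used to evaluate CatBoost classification performance: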

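Accuracy

Accuracy is the share of predictions the model gets right: the number of correct predictions divided by the total number of predictions. It is the most intuitive metric and works well when the classes are roughly balanced, but on imbalanced datasets (where one class heavily outnumbers the other) it can be misleading and may not be the best choice.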
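To make the definition concrete, here is a minimal pure-Python sketch. The helper name `accuracy` and the toy labels are illustrative assumptions, not part of CatBoost. It also shows why accuracy can mislead on imbalanced data: a "model" that always predicts the majority class still scores high.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Toy imbalanced dataset (made-up labels): 95 negatives, 5 positives
y_true = [0] * 95 + [1] * 5

# A trivial "model" that always predicts the majority class
y_pred_majority = [0] * 100

acc = accuracy(y_true, y_pred_majority)
print(f"Accuracy of always predicting 0: {acc:.2f}")
```

Although this trivial predictor never finds a single positive example, its accuracy is 0.95, which is why the other metrics in this chapter are worth checking as well.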

To compute accuracy, we import numpy, catboost, sklearn.datasets, and sklearn.model_selection.

import numpy as np
from catboost import CatBoostClassifier, Pool
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Loading the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Splitting the data into training and testing datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a CatBoostClassifier 
model = CatBoostClassifier(iterations=100, learning_rate=0.1, depth=6, loss_function='MultiClass', verbose=0)

# Create a Pool object 
train_pool = Pool(X_train, label=y_train)
test_pool = Pool(X_test, label=y_test)

# Train the model
model.fit(train_pool)

# Evaluate the model 
metrics = model.eval_metrics(test_pool, metrics=['Accuracy'], plot=True)

# Print the evaluation metrics
accuracy = metrics['Accuracy'][-1]

print(f'Accuracy: {accuracy:.2f}')

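Output

The output below shows an accuracy of 1.00, meaning the model fits this dataset very well and classified every instance in the test set correctly.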

MetricVisualizer(layout=Layout(align_self='stretch', height='500px'))
Accuracy: 1.00

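Multi-Class Log Loss

Multi-class log loss, also known as cross-entropy for multi-class classification, is the variant of log loss designed for problems with more than two classes. The model outputs a probability distribution over the classes for each sample, and the metric measures how well those predicted probabilities match the actual class labels. Lower values are better.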
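As a rough illustration of the idea, the sketch below computes multi-class log loss by hand as the mean negative log of the probability assigned to the true class. The function name `multiclass_log_loss` and the probability tables are made-up examples, not CatBoost output, and CatBoost's own `MultiClass` value is computed internally.

```python
import math

def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip to avoid log(0) when the true class gets probability 0
        p = min(max(row[label], eps), 1.0)
        total -= math.log(p)
    return total / len(y_true)

y_true = [0, 1, 2]

# Both tables pick the correct class, but with different confidence
confident = [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05], [0.05, 0.05, 0.90]]
uncertain = [[0.40, 0.30, 0.30], [0.30, 0.40, 0.30], [0.30, 0.30, 0.40]]

loss_confident = multiclass_log_loss(y_true, confident)
loss_uncertain = multiclass_log_loss(y_true, uncertain)
print(f"Confident predictions: {loss_confident:.3f}")
print(f"Uncertain predictions: {loss_uncertain:.3f}")
```

Both tables would reach 100% accuracy, yet the uncertain one has a much higher loss, because log loss rewards putting high probability on the correct class rather than merely ranking it first.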

import numpy as np
from catboost import CatBoostClassifier, Pool
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Loading the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a CatBoostClassifier
model = CatBoostClassifier(iterations=100, learning_rate=0.1, depth=6, loss_function='MultiClass', verbose=0)

# Create a Pool object
train_pool = Pool(X_train, label=y_train)
test_pool = Pool(X_test, label=y_test)

# Train the model
model.fit(train_pool)

# Evaluate the model for multi-class classification
metrics = model.eval_metrics(test_pool, metrics=['MultiClass'], plot = True)

# Print the evaluation metrics
multi_class_loss = metrics['MultiClass'][-1]

print(f'Multi-Class Loss: {multi_class_loss:.2f}')

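Output

In the output below, a multi-class loss of 0.03 indicates that the model's predicted probabilities closely match the true labels on the test dataset.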

MetricVisualizer(layout=Layout(align_self='stretch', height='500px'))
Multi-Class Loss: 0.03

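Binary Log Loss

Binary log loss measures the gap between the true labels and the predicted probabilities when the dataset has only two classes. Lower values indicate better performance. Because it scores the probabilities themselves rather than only the final class decisions, it is useful in scenarios that need reliable probability estimates, such as fraud detection or medical diagnosis.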
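The following sketch shows the calculation on a handful of made-up probabilities. The helper `binary_log_loss` is a hypothetical name used only for illustration. It highlights how one confidently wrong prediction can dominate the average.

```python
import math

def binary_log_loss(y_true, p_pos, eps=1e-15):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], where p is P(class = 1)."""
    total = 0.0
    for y, p in zip(y_true, p_pos):
        p = min(max(p, eps), 1 - eps)  # keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

y_true = [1, 0, 1, 0]

# Reasonable probabilities for every sample
good = [0.9, 0.1, 0.8, 0.2]
# Same as above, except the last negative is predicted as 99% positive
one_bad = [0.9, 0.1, 0.8, 0.99]

loss_good = binary_log_loss(y_true, good)
loss_bad = binary_log_loss(y_true, one_bad)
print(f"All predictions reasonable: {loss_good:.3f}")
print(f"One confident mistake:      {loss_bad:.3f}")
```

A single overconfident error raises the loss several times over, which is the behavior that makes this metric attractive when the cost of a confident mistake is high.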

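Because the Iris dataset has three classes, this metric does not apply to it. The Breast Cancer dataset, by contrast, has only two classes, which indicate whether a tumor is malignant or benign.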

import numpy as np
from catboost import CatBoostClassifier, Pool
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Loading the Breast Cancer dataset
data = load_breast_cancer()
X, y = data.data, data.target

# Splitting the data into training and testing datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a CatBoostClassifier
model = CatBoostClassifier(iterations=100, learning_rate=0.1, depth=6, verbose=0)

# Create a Pool object 
train_pool = Pool(X_train, label=y_train)
test_pool = Pool(X_test, label=y_test)

# Train the model
model.fit(train_pool)

# Evaluate the model
metrics = model.eval_metrics(test_pool, metrics=['Logloss'], plot =False)

# Print the evaluation metrics
logloss = metrics['Logloss'][-1]

print(f'Log Loss (Cross-Entropy): {logloss:.2f}')

Output

Here is the output of the code above:

Log Loss (Cross-Entropy): 0.08

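AUC-ROC and AUC-PRC

The area under the receiver operating characteristic curve (AUC-ROC) and the area under the precision-recall curve (AUC-PRC) are key metrics for binary classifiers. AUC-ROC evaluates how well the model separates the positive class from the negative class across all decision thresholds, while AUC-PRC focuses on the trade-off between precision and recall, which makes it more informative when the positive class is rare.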
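To give a feel for what these two numbers measure, here is a small pure-Python sketch on four made-up scores. Both helper names are illustrative assumptions. `roc_auc` uses the pairwise-ranking interpretation of AUC-ROC, and `average_precision` is one common step-wise estimate of the area under the precision-recall curve, so its value may differ slightly from the `PRAUC` value CatBoost reports.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.
    Ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Mean precision at each rank where a positive is retrieved.
    Assumes there are no tied scores."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive")
    return precision_sum / hits

y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]  # made-up predicted probabilities

auc_roc_value = roc_auc(y_true, scores)
ap_value = average_precision(y_true, scores)
print(f"AUC-ROC: {auc_roc_value:.2f}")
print(f"Average precision: {ap_value:.2f}")
```

Here one positive (0.35) is ranked below one negative (0.4), so three of the four positive-negative pairs are ordered correctly and neither metric reaches 1.0.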

import catboost
from catboost import CatBoostClassifier, Pool
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Loading the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Converting to binary classification: class 2 becomes 1, all other classes become 0
y_binary = (y == 2).astype(int)

# Splitting the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y_binary, test_size=0.2, random_state=42)

# Creating a CatBoost classifier with AUC-ROC metric
model = CatBoostClassifier(iterations=500, random_seed=42, eval_metric='AUC')

# Converting the training data into a CatBoost Pool
train_pool = Pool(X_train, label=y_train)

# Training the model
model.fit(train_pool, verbose=100)

validation_pool = Pool(X_test, label=y_test)
eval_result = model.eval_metrics(validation_pool, ['AUC'])['AUC']
metrics = model.eval_metrics(validation_pool, metrics=['PRAUC'],plot = True)
auc_pr = metrics['PRAUC'][-1]

# Print the evaluation metrics

print(f'AUC-PR: {auc_pr:.2f}')

print(f"AUC-ROC: {eval_result[-1]:.4f}")

Output

This will produce the following result:

Learning rate set to 0.007867
0:	total: 2.09ms	remaining: 1.04s
100:	total: 42.3ms	remaining: 167ms
200:	total: 67.9ms	remaining: 101ms
300:	total: 89.8ms	remaining: 59.4ms
400:	total: 110ms	remaining: 27ms
499:	total: 129ms	remaining: 0us
MetricVisualizer(layout=Layout(align_self='stretch', height='500px'))
AUC-PR: 1.00
AUC-ROC: 1.0000

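F1 Score

The F1 score is the harmonic mean of precision (how many of the samples predicted as positive really are positive) and recall (how many of the actual positives the model manages to find). It is well suited to balancing the trade-off between false positives and false negatives. Keep in mind that a better model tends to have a higher F1 score, with 1.0 as the maximum.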
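The calculation can be sketched in a few lines of plain Python. The function name `precision_recall_f1` and the toy labels are assumptions made for illustration only.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a count is empty
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Made-up labels: 4 actual positives, 6 actual negatives
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

precision, recall, f1_value = precision_recall_f1(y_true, y_pred)
print(f"Precision: {precision:.2f}, Recall: {recall:.2f}, F1: {f1_value:.2f}")
```

With 2 true positives, 1 false positive, and 2 false negatives, precision is 2/3 and recall is 1/2. The harmonic mean (4/7) sits closer to the lower of the two, so a model cannot earn a high F1 by excelling at only one of them.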

import numpy as np
from catboost import CatBoostClassifier, Pool
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Loading the Breast Cancer dataset
data = load_breast_cancer()
X, y = data.data, data.target

# Splitting the data into training and testing datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Creating a CatBoostClassifier
model = CatBoostClassifier(iterations=100, learning_rate=0.1, depth=6, verbose=0)

# Creating a Pool object 
train_pool = Pool(X_train, label=y_train)
test_pool = Pool(X_test, label=y_test)

# Train the model
model.fit(train_pool)

# Evaluate the model 
metrics = model.eval_metrics(test_pool, metrics=['F1'], plot=True)

# Print the evaluation metrics
f1 = metrics['F1'][-1]

print(f'F1 Score: {f1:.2f}')

Output

This will lead to the following result:

MetricVisualizer(layout=Layout(align_self='stretch', height='500px'))
F1 Score: 0.98

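Summary

In summary, CatBoost offers a wide range of metrics and evaluation tools that simplify model selection and evaluation. It ships with built-in metrics for classification tasks, such as accuracy and log loss, and also allows customization through user-defined metrics. Early stopping, cross-validation, and the ability to monitor several metrics during training help ensure a thorough evaluation.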
