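How to Create a Pivot Table in Python Using Pandas?

A pivot table is a powerful data analysis tool that lets you summarize and aggregate data along different dimensions. In Python, you can create pivot tables with the pandas library, which provides flexible and efficient tools for data manipulation and analysis.

To create a pivot table in pandas, you first need a dataset in a pandas DataFrame. You can load data into a DataFrame from many sources, such as CSV files, Excel spreadsheets, and SQL databases.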
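As a minimal sketch of the loading step, the snippet below reads CSV data into a DataFrame with `pd.read_csv()`. The CSV text and the name `sales` are made up for illustration; an in-memory string stands in for a real file so the example runs on its own. In practice you would pass a file path instead, and `pd.read_excel()` or `pd.read_sql()` play the same role for the other sources.

```python
import io

import pandas as pd

# Hypothetical CSV content used only for illustration.
# With a real file you would pass its path to read_csv instead.
csv_text = """Product,Category,Quantity,Amount
Litchi,Fruit,8,270
Broccoli,Vegetable,5,239
Banana,Fruit,3,617
"""

# read_csv accepts a file path or any file-like object.
sales = pd.read_csv(io.StringIO(csv_text))

# Inspect the inferred column types and the loaded rows.
print(sales.dtypes)
print(sales)
```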

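Once your data is in a DataFrame, you can create a pivot table with the pandas `pivot_table()` function. A simplified form of its syntax, showing the most commonly used parameters, is:

DataFrame.pivot_table(values=None, index=None, columns=None, aggfunc='mean')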

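`pivot_table()` is called on a DataFrame and accepts several parameters: `index` (the column or columns whose values become the row labels), `columns` (the column or columns whose values become the column labels), and `values` (the column or columns to aggregate). You can also specify the aggregation function with `aggfunc`, for example sum, mean, max, or min; if you omit it, the mean is used. Note that `pivot_table()` is distinct from `DataFrame.pivot()`, which only reshapes data and does not aggregate.

Before we look at `pivot_table()` in more detail, let's first create the DataFrame we will be working with.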
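To make the parameters concrete, here is a small sketch that uses `index`, `columns`, `values`, and `aggfunc` together. The `Region` column and all the numbers are invented for this illustration and are not part of the tutorial's dataset. `fill_value=0` is an optional extra that replaces missing combinations with 0.

```python
import pandas as pd

# Hypothetical data for illustration only.
sales = pd.DataFrame({
    "Region": ["North", "North", "South", "South", "South"],
    "Category": ["Fruit", "Fruit", "Fruit", "Vegetable", "Vegetable"],
    "Amount": [100, 40, 70, 30, 60],
})

table = sales.pivot_table(
    index="Region",      # one row per region
    columns="Category",  # one column per category
    values="Amount",     # the numbers being aggregated
    aggfunc="sum",       # how values in the same cell are combined
    fill_value=0,        # North has no Vegetable rows, so that cell becomes 0
)
print(table)
```

Each cell of the result holds the sum of `Amount` for one region and category pair.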

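DataFrame in Pandas

In pandas, a DataFrame is a two-dimensional labeled data structure whose columns can have different types. It is the primary data structure used in pandas for data manipulation and analysis.

A DataFrame can be thought of as a spreadsheet or SQL table with rows and columns. It makes it easy to work with data, including indexing, selecting, filtering, merging, and grouping.

Consider the following code. It uses a Python dictionary to create a DataFrame object named df with four columns: 'Product', 'Category', 'Quantity', and 'Amount'. Each key of the dictionary is a column name, and its value is a list containing the values for that column.

Example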
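The operations listed above can be sketched briefly as follows. The `items` and `prices` frames, including the `UnitPrice` column, are hypothetical and exist only to keep the example self-contained.

```python
import pandas as pd

# Small hypothetical frames for illustration.
items = pd.DataFrame({
    "Product": ["Litchi", "Broccoli", "Banana"],
    "Category": ["Fruit", "Vegetable", "Fruit"],
    "Quantity": [8, 5, 3],
})
prices = pd.DataFrame({
    "Product": ["Litchi", "Banana"],
    "UnitPrice": [30, 10],
})

# Selecting: a single column comes back as a Series.
quantities = items["Quantity"]

# Indexing: .loc looks up by row label and column name.
second_product = items.loc[1, "Product"]

# Filtering: a boolean mask keeps only the matching rows.
fruit = items[items["Category"] == "Fruit"]

# Grouping: total quantity per category.
qty_by_category = items.groupby("Category")["Quantity"].sum()

# Merging: a left join keeps every row of items,
# leaving UnitPrice missing where prices has no match.
merged = items.merge(prices, on="Product", how="left")

print(fruit)
print(qty_by_category)
print(merged)
```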

# importing pandas library
import pandas as pd

# creating a dataframe from a dictionary with the columns
# 'Product', 'Category', 'Quantity' and 'Amount'
df = pd.DataFrame({
   'Product': ['Litchi', 'Broccoli', 'Banana', 'Banana', 'Beans', 'Orange', 'Mango', 'Banana'],
   'Category': ['Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Fruit'],
   'Quantity': [8, 5, 3, 4, 5, 9, 11, 8],
   'Amount': [270, 239, 617, 384, 626, 610, 62, 90]
})

# print the dataframe
print(df)

Output

When you run this code, it produces the following output in the terminal:

    Product   Category  Quantity  Amount
0    Litchi      Fruit         8     270
1  Broccoli  Vegetable         5     239
2    Banana      Fruit         3     617
3    Banana      Fruit         4     384
4     Beans  Vegetable         5     626
5    Orange      Fruit         9     610
6     Mango      Fruit        11      62
7    Banana      Fruit         8      90

Creating a Pivot Table Using Pandas

Now let's use the `pivot_table()` function to create a pivot table of total sales. Consider the following code.

Example

# importing pandas library
import pandas as pd

# creating a dataframe from a dictionary with the columns
# 'Product', 'Category', 'Quantity' and 'Amount'
df = pd.DataFrame({
   'Product': ['Litchi', 'Broccoli', 'Banana', 'Banana', 'Beans', 'Orange', 'Mango', 'Banana'],
   'Category': ['Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Fruit'],
   'Quantity': [8, 5, 3, 4, 5, 9, 11, 8],
   'Amount': [270, 239, 617, 384, 626, 610, 62, 90]
})

# creating a pivot table of total sales, product-wise
pivot = df.pivot_table(index=['Product'], values=['Amount'], aggfunc='sum')
print(pivot)

# print the dataframe
print(df)

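Explanation

The code creates a DataFrame object named df with four columns: 'Product', 'Category', 'Quantity', and 'Amount'. Each column's values come from a Python dictionary.
Next, the code uses the `pivot_table()` function to build a pivot table of the sales data grouped by product, summing the 'Amount' column to get the total sales for each product.
Finally, the pivot table is printed to the console to show the total sales per product, followed by the original DataFrame that the pivot table was built from.

Output

After execution, you will get the following output in the terminal: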

          Amount
Product
Banana      1091
Beans        626
Broccoli     239
Litchi       270
Mango         62
Orange       610
    Product   Category  Quantity  Amount
0    Litchi      Fruit         8     270
1  Broccoli  Vegetable         5     239
2    Banana      Fruit         3     617
3    Banana      Fruit         4     384
4     Beans  Vegetable         5     626
5    Orange      Fruit         9     610
6     Mango      Fruit        11      62
7    Banana      Fruit         8      90
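As a possible next step with the same data, the sketch below pivots by 'Category' instead of 'Product', aggregates two value columns at once, and uses the `margins=True` option to append a grand-total row labeled "All". It also computes the per-product totals with `groupby()`, which for a single index column and a single aggregation gives the same numbers as the pivot table above. The names `by_category` and `by_product` are chosen here for illustration.

```python
import pandas as pd

# Same dataset as in the tutorial.
df = pd.DataFrame({
    'Product': ['Litchi', 'Broccoli', 'Banana', 'Banana', 'Beans', 'Orange', 'Mango', 'Banana'],
    'Category': ['Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Fruit'],
    'Quantity': [8, 5, 3, 4, 5, 9, 11, 8],
    'Amount': [270, 239, 617, 384, 626, 610, 62, 90]
})

# Sum Quantity and Amount per category; margins=True adds an "All" row
# holding the totals across every category.
by_category = df.pivot_table(
    index='Category',
    values=['Quantity', 'Amount'],
    aggfunc='sum',
    margins=True,
)
print(by_category)

# Per-product totals computed with groupby instead of pivot_table.
by_product = df.groupby('Product')['Amount'].sum()
print(by_product)
```

`pivot_table()` tends to be the more convenient choice once you also want column labels or totals, while `groupby()` is often enough for a single grouped aggregate.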

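Conclusion

In summary, creating pivot tables in Python with the pandas library is a powerful way to analyze tabular data and extract meaningful insights. By grouping data and computing aggregate values, pivot tables can help you spot patterns and trends that would be hard to see in the raw rows. With `pivot_table()`, a single function call is enough to build one.

Having followed the steps in this tutorial, you should now have a solid foundation for creating and using pivot tables in Python.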

Mukul Latiyan

Updated on: April 20, 2023
