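How to get the correlation between two columns in Pandas?
In Pandas, we can get the correlation between two columns with the .corr() method: call it on one column and pass the other column as the argument. Let's walk through an example to see how to apply it.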
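For intuition about what the returned number means: Series.corr uses the Pearson correlation coefficient by default. The sketch below recomputes that coefficient by hand with NumPy and compares it with the pandas result. The variable names (dx, dy, manual_r, pandas_r) are illustrative only and are not part of the original example.

```python
import numpy as np
import pandas as pd

# Two columns taken from the example data used in this article
x = pd.Series([5, 2, 7, 0])
y = pd.Series([4, 7, 5, 1])

# Deviations of each value from its column mean
dx = x - x.mean()
dy = y - y.mean()

# Pearson r: sum of products of deviations, divided by the
# square root of the product of the sums of squared deviations
manual_r = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# pandas computes the same quantity (method="pearson" is the default)
pandas_r = x.corr(y)

print("manual:", round(manual_r, 4))
print("pandas:", round(pandas_r, 4))
```

Both lines should print the same value, which rounds to 0.41 at two decimal places.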
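Steps
- Create a DataFrame df, a two-dimensional, size-mutable, potentially heterogeneous tabular data structure.
- Print the input DataFrame df.
- Initialize two variables, col1 and col2, and assign them the names of the columns whose correlation you want to find.
- Use df[col1].corr(df[col2]) to compute the correlation between col1 and col2, and store the result in a variable corr.
- Print the correlation value corr.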
Example
import pandas as pd

# Build the input DataFrame with three numeric columns
df = pd.DataFrame(
    {
        "x": [5, 2, 7, 0],
        "y": [4, 7, 5, 1],
        "z": [9, 3, 5, 1]
    }
)

print("Input DataFrame is:")
print(df)

# Compute and print the correlation for each pair of columns
for col1, col2 in [("x", "y"), ("x", "x"), ("x", "z"), ("y", "x")]:
    corr = df[col1].corr(df[col2])
    print(f"Correlation between {col1} and {col2} is: {round(corr, 2)}")
Output
Input DataFrame is:
   x  y  z
0  5  4  9
1  2  7  3
2  7  5  5
3  0  1  1
Correlation between x and y is: 0.41
Correlation between x and x is: 1.0
Correlation between x and z is: 0.72
Correlation between y and x is: 0.41
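The output shows that a column's correlation with itself is 1.0 and that the order of the two columns does not matter (x with y equals y with x). When several pairs are needed at once, one option is DataFrame.corr(), which returns the full pairwise correlation matrix. This is a minimal sketch that reuses the example data. The names corr_matrix and xz are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "x": [5, 2, 7, 0],
        "y": [4, 7, 5, 1],
        "z": [9, 3, 5, 1]
    }
)

# Pairwise Pearson correlations between all numeric columns;
# the result is a square DataFrame indexed by column name
corr_matrix = df.corr()
print(corr_matrix.round(2))

# Look up a single pair by its row and column labels
xz = corr_matrix.loc["x", "z"]
print("Correlation between x and z is:", round(xz, 2))
```

Reading a value from the matrix with .loc gives the same number as calling df["x"].corr(df["z"]) directly.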