How to count the frequency of itemsets in a Pandas DataFrame


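To count how often each item occurs in a column, use the Series.value_counts() method. It returns a Series whose index holds the distinct values and whose values are their counts, sorted in descending order by default. First, let us create a DataFrame: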

import pandas as pd

# Create DataFrame
dataFrame = pd.DataFrame({'Car': ['BMW', 'Mercedes', 'Lamborghini', 'Audi', 'Mercedes', 'Porsche', 'Lamborghini', 'BMW'],
   'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Bangalore', 'Hyderabad', 'Mumbai', 'Mumbai', 'Pune'],
   'UnitsSold': [95, 80, 80, 75, 92, 90, 95, 50]})

Count the frequency of each value in the Car column with value_counts():

# counting frequency of column Car
count1 = dataFrame['Car'].value_counts()
print("\nCount in column Car")
print(count1)
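If proportions are more useful than raw counts, value_counts() also accepts normalize=True. The following is a minimal sketch on the same Car data; the variable names relative and bmw_count are chosen here for illustration and are not part of the original example.

```python
import pandas as pd

# Same Car column as in the DataFrame above
dataFrame = pd.DataFrame({'Car': ['BMW', 'Mercedes', 'Lamborghini', 'Audi',
                                  'Mercedes', 'Porsche', 'Lamborghini', 'BMW']})

# normalize=True returns each value's share of the rows instead of a raw count
relative = dataFrame['Car'].value_counts(normalize=True)
print(relative)

# The result is a Series indexed by the distinct values,
# so the count of a single item can be looked up by label
bmw_count = dataFrame['Car'].value_counts()['BMW']
print(bmw_count)
```

Because the result is an ordinary Series, it can be filtered, sorted, or plotted like any other.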

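The Place and UnitsSold columns can be counted in the same way. The complete code for counting the frequency of itemsets in a Pandas DataFrame is as follows: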

Example

import pandas as pd

# Create DataFrame
dataFrame = pd.DataFrame({'Car': ['BMW', 'Mercedes', 'Lamborghini', 'Audi', 'Mercedes', 'Porsche', 'Lamborghini', 'BMW'],
   'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Bangalore', 'Hyderabad', 'Mumbai', 'Mumbai', 'Pune'],
   'UnitsSold': [95, 80, 80, 75, 92, 90, 95, 50 ]})

print("Dataframe...")
print(dataFrame)

# counting frequency of column Car
count1 = dataFrame['Car'].value_counts()
print("\nCount in column Car")
print(count1)

# counting frequency of column Place
count2 = dataFrame['Place'].value_counts()
print("\nCount in column Place")
print(count2)

# counting frequency of column UnitsSold
count3 = dataFrame['UnitsSold'].value_counts()
print("\nCount in column UnitsSold")
print(count3)

Output

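This produces the following output. The order of entries with equal counts and the trailing Name line can differ between pandas versions: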

Dataframe...
          Car       Place  UnitsSold
0         BMW       Delhi         95
1    Mercedes   Hyderabad         80
2 Lamborghini  Chandigarh         80
3        Audi   Bangalore         75
4    Mercedes   Hyderabad         92
5     Porsche      Mumbai         90
6 Lamborghini      Mumbai         95
7         BMW        Pune         50

Count in column Car
BMW            2
Lamborghini    2
Mercedes       2
Audi           1
Porsche        1
Name: Car, dtype: int64

Count in column Place
Mumbai        2
Hyderabad     2
Chandigarh    1
Pune          1
Delhi         1
Bangalore     1
Name: Place, dtype: int64

Count in column UnitsSold
95     2
80     2
92     1
75     1
90     1
50     1
Name: UnitsSold, dtype: int64
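The counts above treat each column separately. To count how often a combination of values occurs across several columns, one option is DataFrame.value_counts() on the selected columns; this is a sketch that assumes pandas 1.1 or later, where that method is available. Grouping by the columns and calling size() gives the same counts on older versions. The names pair_counts and pair_counts_groupby are illustrative.

```python
import pandas as pd

# Same data as in the example above
dataFrame = pd.DataFrame({'Car': ['BMW', 'Mercedes', 'Lamborghini', 'Audi', 'Mercedes', 'Porsche', 'Lamborghini', 'BMW'],
   'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Bangalore', 'Hyderabad', 'Mumbai', 'Mumbai', 'Pune'],
   'UnitsSold': [95, 80, 80, 75, 92, 90, 95, 50]})

# Count each distinct (Car, Place) pair; assumes pandas >= 1.1
pair_counts = dataFrame[['Car', 'Place']].value_counts()
print(pair_counts)

# Alternative that does not rely on DataFrame.value_counts():
# group by both columns and count the rows in each group
pair_counts_groupby = dataFrame.groupby(['Car', 'Place']).size()
print(pair_counts_groupby)
```

Both results are Series with a two-level index, so a single pair can be looked up with a tuple, for example pair_counts.loc[('Mercedes', 'Hyderabad')].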

Updated on: 09-Sep-2021
