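How to Calculate the Frequency of Itemsets in a Pandas DataFrame
Use the Series.value_counts() method to count how many times each distinct value occurs in a column. First, let us create a DataFrame: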
    import pandas as pd

    # Create DataFrame
    dataFrame = pd.DataFrame({
        'Car': ['BMW', 'Mercedes', 'Lamborghini', 'Audi', 'Mercedes', 'Porsche', 'Lamborghini', 'BMW'],
        'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Bangalore', 'Hyderabad', 'Mumbai', 'Mumbai', 'Pune'],
        'UnitsSold': [95, 80, 80, 75, 92, 90, 95, 50]
    })
Use the value_counts() method to count the frequency of each value in the Car column:
    # counting frequency of column Car
    count1 = dataFrame['Car'].value_counts()
    print("\nCount in column Car")
    print(count1)
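When shares are more useful than raw counts, value_counts() also accepts normalize=True. The following is a minimal sketch, assuming a standalone Series named cars (a hypothetical name, holding the same values as the Car column above):

```python
import pandas as pd

# Hypothetical standalone Series with the same values as the Car column
cars = pd.Series(['BMW', 'Mercedes', 'Lamborghini', 'Audi',
                  'Mercedes', 'Porsche', 'Lamborghini', 'BMW'], name='Car')

# Absolute counts, as in the snippet above
counts = cars.value_counts()

# Relative frequencies: each count divided by the total number of rows
shares = cars.value_counts(normalize=True)

print(counts)
print(shares)
```

The relative frequencies sum to 1, which makes them easier to compare across columns of different lengths.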
The other columns can be counted in the same way. The following is the complete code to calculate the frequency of itemsets in a Pandas DataFrame:
Example
    import pandas as pd

    # Create DataFrame
    dataFrame = pd.DataFrame({
        'Car': ['BMW', 'Mercedes', 'Lamborghini', 'Audi', 'Mercedes', 'Porsche', 'Lamborghini', 'BMW'],
        'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Bangalore', 'Hyderabad', 'Mumbai', 'Mumbai', 'Pune'],
        'UnitsSold': [95, 80, 80, 75, 92, 90, 95, 50]
    })

    print("Dataframe...")
    print(dataFrame)

    # counting frequency of column Car
    count1 = dataFrame['Car'].value_counts()
    print("\nCount in column Car")
    print(count1)

    # counting frequency of column Place
    count2 = dataFrame['Place'].value_counts()
    print("\nCount in column Place")
    print(count2)

    # counting frequency of column UnitsSold
    count3 = dataFrame['UnitsSold'].value_counts()
    print("\nCount in column UnitsSold")
    print(count3)
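Output
This will produce the following output (the order of values with equal counts, and the label on the Name: line, can differ between pandas versions):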
    Dataframe...
               Car       Place  UnitsSold
    0          BMW       Delhi         95
    1     Mercedes   Hyderabad         80
    2  Lamborghini  Chandigarh         80
    3         Audi   Bangalore         75
    4     Mercedes   Hyderabad         92
    5      Porsche      Mumbai         90
    6  Lamborghini      Mumbai         95
    7          BMW        Pune         50

    Count in column Car
    BMW            2
    Lamborghini    2
    Mercedes       2
    Audi           1
    Porsche        1
    Name: Car, dtype: int64

    Count in column Place
    Mumbai        2
    Hyderabad     2
    Chandigarh    1
    Pune          1
    Delhi         1
    Bangalore     1
    Name: Place, dtype: int64

    Count in column UnitsSold
    95    2
    80    2
    92    1
    75    1
    90    1
    50    1
    Name: UnitsSold, dtype: int64
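The example above counts each column on its own. If an "itemset" is taken to mean a combination of values across columns, one possible approach (an assumption, not something the example itself does) is to group by several columns and take the group sizes. A minimal sketch, where pair_counts is a hypothetical name:

```python
import pandas as pd

# Same data as in the example above
dataFrame = pd.DataFrame({
    'Car': ['BMW', 'Mercedes', 'Lamborghini', 'Audi',
            'Mercedes', 'Porsche', 'Lamborghini', 'BMW'],
    'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Bangalore',
              'Hyderabad', 'Mumbai', 'Mumbai', 'Pune'],
    'UnitsSold': [95, 80, 80, 75, 92, 90, 95, 50]
})

# Count how many rows share each (Car, Place) combination.
# The result is a Series indexed by a (Car, Place) MultiIndex.
pair_counts = dataFrame.groupby(['Car', 'Place']).size()

print(pair_counts)
```

Here only the Mercedes and Hyderabad pair appears more than once; every other combination occurs a single time.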