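Python Pandas - Filling missing column values with the mode
The mode is the value that occurs most often in a set of values. To fill missing values in a column with the mode, call fillna() and pass the mode as the fill value. First, import the required libraries under their usual aliases: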
import pandas as pd
import numpy as np
Create a DataFrame with two columns. The missing values are set with NumPy's np.nan (the np.NaN alias was removed in NumPy 2.0):
dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan]
    }
)
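Before filling anything, it can help to confirm where the missing values are. The sketch below rebuilds the same DataFrame and counts the NaN entries per column with isna().sum(). The variable name missing_per_column is only illustrative and is not part of the original example.

```python
import numpy as np
import pandas as pd

dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan],
    }
)

# Count the NaN entries in each column (illustrative variable name)
missing_per_column = dataFrame.isna().sum()
print(missing_per_column)
```

Only the Units column should report missing entries here, which is why the rest of the walkthrough works with that column.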
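Find the mode of the column that contains NaN values, which is the Units column here. mode() returns a Series of the most frequent values, so [0] takes the first one, and fillna() replaces every NaN with it: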
dataFrame.fillna(dataFrame['Units'].mode()[0], inplace = True)
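Two details of that call are worth seeing in isolation. First, mode() returns a sorted Series holding every value tied for the highest count. In this data 80, 100 and 150 each occur once, so [0] simply picks the smallest of them. Second, fillna() called on the whole DataFrame applies the value to every column, so assigning back to the Units column alone may be safer when other columns could also contain NaN. A minimal sketch of that variant follows; units_modes and fill_value are illustrative names.

```python
import numpy as np
import pandas as pd

dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan],
    }
)

# mode() skips NaN by default and returns all tied values in sorted order
units_modes = dataFrame['Units'].mode()
print(units_modes.tolist())

# Take the first (smallest) mode and fill only the Units column
fill_value = units_modes[0]
dataFrame['Units'] = dataFrame['Units'].fillna(fill_value)
print(dataFrame)
```

Assigning the result back, rather than using inplace=True, leaves the other columns untouched.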
Example
The complete code is as follows:
import pandas as pd
import numpy as np

# Create the DataFrame; np.nan marks the missing values
dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan]
    }
)

print("DataFrame ...\n", dataFrame)

# Find the mode of the column that contains NaN (the Units column here)
# and replace the NaNs with that value
dataFrame.fillna(dataFrame['Units'].mode()[0], inplace=True)

print("\nUpdated Dataframe after filling NaN values with mode...\n", dataFrame)
Output
This produces the following output:
DataFrame ...
        Car  Units
0      BMW  100.0
1    Lexus  150.0
2    Lexus    NaN
3  Mustang   80.0
4  Bentley    NaN
5  Mustang    NaN

Updated Dataframe after filling NaN values with mode...
        Car  Units
0      BMW  100.0
1    Lexus  150.0
2    Lexus   80.0
3  Mustang   80.0
4  Bentley   80.0
5  Mustang   80.0
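When several columns have gaps, one possible extension is to fill each column with its own mode. The sketch below uses a small hypothetical DataFrame; the data and the names df and filled are assumptions, not part of the original example. DataFrame.mode() returns one row per tied mode, so .iloc[0] takes the first mode of every column, and passing that Series to fillna() fills each column by its label.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps in both columns
df = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'Lexus', None, 'Bentley'],
        "Units": [100, 80, np.nan, 80, np.nan],
    }
)

# First row of df.mode() holds the first mode of each column
column_modes = df.mode().iloc[0]

# fillna() with a Series matches fill values to columns by label
filled = df.fillna(column_modes)
print(filled)
```

Because the result is assigned to a new variable, the original df keeps its NaN values for comparison.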