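Which datasets in R contain missing values?
Instructors and educators often need to teach missing-value imputation, so they need datasets that contain some missing values, or they have to create such datasets themselves. R already provides several datasets with missing values, for example the airquality data in base R and the food data in the VIM package. Many other packages probably ship datasets with missing values as well, but exploring all of them takes a lot of time. The examples below therefore use airquality and a few datasets from the VIM package.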
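Because exploring datasets by hand is slow, the search can be partly automated. The following is a minimal sketch, assuming the default `datasets` package is attached (as it is in a standard R session); the variable names `ds_names`, `has_na`, and `datasets_with_na` are illustrative only. It lists every object in that package that contains at least one NA.

```r
# List the datasets shipped with the 'datasets' package.
items <- data(package = "datasets")$results[, "Item"]

# Entries such as "beaver1 (beavers)" name an object plus its source file;
# keep only the object name before the first space.
ds_names <- unique(sub(" .*$", "", items))

# Flag each object that contains at least one NA.
has_na <- vapply(ds_names, function(nm) {
  obj <- get(nm, envir = as.environment("package:datasets"))
  anyNA(obj)
}, logical(1))

# Names of the datasets with missing values (airquality is among them).
datasets_with_na <- names(has_na)[has_na]
print(datasets_with_na)
```

The same idea should carry over to other installed packages by changing the `package` argument and loading each dataset with `data()`, although that variant is not shown here.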
Example 1
head(airquality, 20)
Output
   Ozone Solar.R Wind Temp Month Day
1     41     190  7.4   67     5   1
2     36     118  8.0   72     5   2
3     12     149 12.6   74     5   3
4     18     313 11.5   62     5   4
5     NA      NA 14.3   56     5   5
6     28      NA 14.9   66     5   6
7     23     299  8.6   65     5   7
8     19      99 13.8   59     5   8
9      8      19 20.1   61     5   9
10    NA     194  8.6   69     5  10
11     7      NA  6.9   74     5  11
12    16     256  9.7   69     5  12
13    11     290  9.2   66     5  13
14    14     274 10.9   68     5  14
15    18      65 13.2   58     5  15
16    14     334 11.5   64     5  16
17    34     307 12.0   66     5  17
18     6      78 18.4   57     5  18
19    30     322 11.5   68     5  19
20    11      44  9.7   62     5  20
Example
summary(airquality)
Output
     Ozone           Solar.R           Wind             Temp
 Min.   :  1.00   Min.   :  7.0   Min.   : 1.700   Min.   :56.00
 1st Qu.: 18.00   1st Qu.:115.8   1st Qu.: 7.400   1st Qu.:72.00
 Median : 31.50   Median :205.0   Median : 9.700   Median :79.00
 Mean   : 42.13   Mean   :185.9   Mean   : 9.958   Mean   :77.88
 3rd Qu.: 63.25   3rd Qu.:258.8   3rd Qu.:11.500   3rd Qu.:85.00
 Max.   :168.00   Max.   :334.0   Max.   :20.700   Max.   :97.00
 NA's   :37       NA's   :7
     Month            Day
 Min.   :5.000   Min.   : 1.0
 1st Qu.:6.000   1st Qu.: 8.0
 Median :7.000   Median :16.0
 Mean   :6.993   Mean   :15.8
 3rd Qu.:8.000   3rd Qu.:23.0
 Max.   :9.000   Max.   :31.0
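The NA's rows in the summary output report missing values only for the columns that have them. When the counts themselves are the point, a more direct check may be easier to read. This is a small sketch using only base R functions; the names `na_per_column` and `n_complete` are illustrative.

```r
# Count the missing values in each column of airquality.
na_per_column <- colSums(is.na(airquality))
print(na_per_column)   # Ozone and Solar.R match the NA's rows of summary()

# Count the rows that have no missing value in any column.
n_complete <- sum(complete.cases(airquality))
print(n_complete)
```

`colSums(is.na(x))` works on any data frame, so the same two lines can be reused for the VIM datasets in the later examples.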
Example 2
Load the VIM package:
library(VIM)
summary(SBS5242)
Output
       PW              BWS_F              Umsatz              PERSA
 Min.   :   21.5   Min.   :   10.22   Min.   :    5.003   Min.   :    -0.60
 1st Qu.:  365.1   1st Qu.:  171.30   1st Qu.:  306.267   1st Qu.:     3.21
 Median :  985.1   Median :  503.18   Median :  801.245   Median :   105.09
 Mean   : 2515.4   Mean   : 1406.27   Mean   : 2100.978   Mean   :  3147.40
 3rd Qu.: 2691.6   3rd Qu.: 1309.43   3rd Qu.: 2548.480   3rd Qu.:  1002.91
 Max.   :43888.8   Max.   :35081.07   Max.   :23558.504   Max.   :127175.91
 NA's   :5         NA's   :5          NA's   :5           NA's   :5
     BEZWD             BEZWDVK             BESCH               USB
 Min.   :   10.43   Min.   : -0.6602   Min.   : -0.1676   Min.   : -0.6841
 1st Qu.:  192.40   1st Qu.:  0.0000   1st Qu.:  3.9790   1st Qu.:  0.5444
 Median :  517.46   Median :  5.5174   Median :  9.4356   Median :  4.7794
 Mean   : 1453.76   Mean   : 18.1511   Mean   : 17.6972   Mean   : 12.0593
 3rd Qu.: 1417.21   3rd Qu.: 20.4039   3rd Qu.: 20.1053   3rd Qu.: 18.0577
 Max.   :37577.19   Max.   :379.0521   Max.   :310.0948   Max.   :105.3674
 NA's   :5          NA's   :5          NA's   :5          NA's   :5
     ISACH
 Min.   :  -0.925
 1st Qu.:   3.753
 Median :  31.026
 Mean   : 191.003
 3rd Qu.: 127.189
 Max.   :6575.334
 NA's   :5
Example 3
summary(bcancer)
Output
        ID          clump_thickness   uniformity_cellsize   uniformity_cellshape
 Min.   :   61634   Min.   : 1.000    Min.   : 1.000        Min.   : 1.000
 1st Qu.:  870688   1st Qu.: 2.000    1st Qu.: 1.000        1st Qu.: 1.000
 Median : 1171710   Median : 4.000    Median : 1.000        Median : 1.000
 Mean   : 1071704   Mean   : 4.418    Mean   : 3.134        Mean   : 3.207
 3rd Qu.: 1238298   3rd Qu.: 6.000    3rd Qu.: 5.000        3rd Qu.: 5.000
 Max.   :13454352   Max.   :10.000    Max.   :10.000        Max.   :10.000
    adhesion      epithelial_cellsize    bare_nuclei       chromatin
 Min.   : 1.000   Min.   : 1.000        Min.   : 1.000   Min.   : 1.000
 1st Qu.: 1.000   1st Qu.: 2.000        1st Qu.: 1.000   1st Qu.: 2.000
 Median : 1.000   Median : 2.000        Median : 1.000   Median : 3.000
 Mean   : 2.807   Mean   : 3.216        Mean   : 3.545   Mean   : 3.438
 3rd Qu.: 4.000   3rd Qu.: 4.000        3rd Qu.: 6.000   3rd Qu.: 5.000
 Max.   :10.000   Max.   :10.000        Max.   :10.000   Max.   :10.000
                                        NA's   :16
 normal_nucleoli      mitoses             class
 Min.   : 1.000    Min.   : 1.000   benign   :458
 1st Qu.: 1.000    1st Qu.: 1.000   malignant:241
 Median : 1.000    Median : 1.000
 Mean   : 2.867    Mean   : 1.589
 3rd Qu.: 4.000    3rd Qu.: 1.000
 Max.   :10.000    Max.   :10.000
Example 4
summary(brittleness)
Output
     TK104           TK105           TK107
 Min.   :188.0   Min.   :223.0   Min.   :240.0
 1st Qu.:369.5   1st Qu.:370.0   1st Qu.:425.0
 Median :423.5   Median :460.0   Median :479.0
 Mean   :421.0   Mean   :472.2   Mean   :470.1
 3rd Qu.:482.2   3rd Qu.:549.0   3rd Qu.:548.5
 Max.   :697.0   Max.   :709.0   Max.   :733.0
 NA's   :3       NA's   :2
Example 5
summary(food)
Output
    Country     Real.coffee   Instant.coffee        Tea         Sweetener
 Austria: 1   Min.   :27.00   Min.   :10.00    Min.   :40.00   Min.   : 2.0
 Belgium: 1   1st Qu.:71.50   1st Qu.:17.00    1st Qu.:62.50   1st Qu.:11.0
 Denmark: 1   Median :89.00   Median :39.00    Median :84.50   Median :19.0
 England: 1   Mean   :78.56   Mean   :39.25    Mean   :78.50   Mean   :18.0
 Finland: 1   3rd Qu.:96.00   3rd Qu.:54.25    3rd Qu.:92.25   3rd Qu.:26.5
 France : 1   Max.   :98.00   Max.   :86.00    Max.   :99.00   Max.   :35.0
 (Other):10                                                    NA's   :1
    Biscuits      Powder.soup       Tin.soup        Potatoes
 Min.   :22.00   Min.   :27.00   Min.   : 1.00   Min.   : 2.00
 1st Qu.:56.00   1st Qu.:36.25   1st Qu.: 3.75   1st Qu.: 6.50
 Median :62.00   Median :47.00   Median :11.50   Median :10.00
 Mean   :60.67   Mean   :49.00   Mean   :18.31   Mean   :12.75
 3rd Qu.:75.00   3rd Qu.:58.00   3rd Qu.:20.00   3rd Qu.:17.00
 Max.   :91.00   Max.   :75.00   Max.   :76.00   Max.   :39.00
 NA's   :1
  Frozen.fish    Frozen.veggies       Apples         Oranges
 Min.   : 4.00   Min.   : 2.00    Min.   :22.00   Min.   :42.00
 1st Qu.:13.75   1st Qu.: 6.50    1st Qu.:56.75   1st Qu.:65.25
 Median :19.50   Median :13.00    Median :71.50   Median :72.00
 Mean   :21.88   Mean   :15.88    Mean   :66.81   Mean   :70.50
 3rd Qu.:26.25   3rd Qu.:21.50    3rd Qu.:81.00   3rd Qu.:77.25
 Max.   :54.00   Max.   :45.00    Max.   :87.00   Max.   :94.00
  Tinned.fruit        Jam            Garlic          Butter
 Min.   : 8.00   Min.   :16.00   Min.   : 5.00   Min.   :31.00
 1st Qu.:28.00   1st Qu.:40.25   1st Qu.:11.00   1st Qu.:64.50
 Median :43.00   Median :54.00   Median :25.50   Median :83.00
 Mean   :41.94   Mean   :55.19   Mean   :42.31   Mean   :75.81
 3rd Qu.:50.75   3rd Qu.:72.00   3rd Qu.:81.50   3rd Qu.:94.00
 Max.   :89.00   Max.   :91.00   Max.   :91.00   Max.   :97.00
   Margarine       Olive.oil        Yoghurt       Crisp.bread
 Min.   :24.00   Min.   :13.00   Min.   : 2.00   Min.   : 3.00
 1st Qu.:47.75   1st Qu.:29.50   1st Qu.: 5.50   1st Qu.:10.50
 Median :79.00   Median :52.50   Median :13.00   Median :21.00
 Mean   :69.12   Mean   :54.19   Mean   :20.53   Mean   :27.75
 3rd Qu.:94.00   3rd Qu.:83.25   3rd Qu.:30.50   3rd Qu.:31.00
 Max.   :97.00   Max.   :94.00   Max.   :57.00   Max.   :93.00
                                 NA's   :1
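When none of the ready-made datasets fits a lesson, educators can also create missing values themselves, as mentioned at the start. The following is one possible sketch, not a prescribed method: it deletes values completely at random from the numeric columns of the built-in iris data. The 10% proportion, the seed, and the names `iris_na`, `prop_missing`, and `numeric_cols` are all assumptions chosen for illustration.

```r
# Introduce missing values completely at random into a complete dataset.
set.seed(42)                       # fixed seed so the example is reproducible
iris_na <- iris                    # work on a copy; iris itself has no NAs
prop_missing <- 0.10               # hypothetical share of values to delete

# Only the numeric columns receive missing values in this sketch.
numeric_cols <- names(iris_na)[sapply(iris_na, is.numeric)]

for (col in numeric_cols) {
  n_missing <- round(prop_missing * nrow(iris_na))
  idx <- sample(nrow(iris_na), n_missing)   # rows to blank out in this column
  iris_na[idx, col] <- NA
}

# Each numeric column now has the same number of NAs; Species has none.
print(colSums(is.na(iris_na)))
```

Because the original complete data is still available in `iris`, imputed values can later be compared against the true ones, which is convenient when teaching imputation.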