如何在R中根据字符列计算多个列的平均值?
如果我们有一个字符列,这意味着该列中更可能存在重复值,因此无法直接根据字符列中的值计算数值列的平均值。为此,我们可以使用aggregate函数,如下例所示。
示例1
考虑以下数据框:
set.seed(214) x1<−sample(c("A","B","C"),20,replace=TRUE) x2<−rnorm(20,2,0.25) x3<−rnorm(20,524,32.14) x4<−rnorm(20,0.4,0.007) df1<−data.frame(x1,x2,x3,x4) df1
输出
x1 x2 x3 x4 1 C 1.798039 483.1139 0.4105840 2 A 2.044579 574.3082 0.3970706 3 B 1.563161 517.0169 0.3983528 4 A 1.722058 546.6778 0.4049876 5 C 2.279675 527.9437 0.3959229 6 C 1.721763 549.1909 0.4096398 7 A 1.903552 486.4005 0.4091737 8 C 2.268382 502.0860 0.3998178 9 B 1.637324 540.1949 0.3970408 10 B 2.204911 433.7537 0.4003291 11 A 2.124006 521.2464 0.3935380 12 C 1.830028 525.4325 0.3937420 13 C 1.353421 553.7063 0.3974116 14 B 2.067984 547.9531 0.3958582 15 C 1.849775 537.0034 0.3986610 16 C 1.763957 531.4417 0.4015034 17 A 1.932413 545.7784 0.4215530 18 B 2.147503 491.7046 0.3918060 19 C 2.222639 563.4852 0.3902110 20 A 1.979280 483.2847 0.3820964
根据x1中的值查找列平均值:
Explore our latest online courses and learn new skills at your own pace. Enroll and become a certified expert to boost your career.
示例
aggregate(.~x1,data=df1,mean)
输出
x1 x2 x3 x4 1 A 1.950981 526.2827 0.4014032 2 B 1.924176 506.1246 0.3966774 3 C 1.898631 530.3782 0.3997215
示例2
y1<−sample(c("Male","Female"),20,replace=TRUE) y2<−rpois(20,2) y3<−rpois(20,5) y4<−rpois(20,3) y5<−rpois(20,2) y6<−rpois(20,1) y7<−rpois(20,4) df2<−data.frame(y1,y2,y3,y4,y5,y6,y7) df2
输出
y1 y2 y3 y4 y5 y6 y7 1 Male 1 6 1 1 1 4 2 Male 4 2 1 4 0 5 3 Male 2 6 5 3 1 2 4 Male 4 4 0 3 2 0 5 Male 1 2 6 2 2 2 6 Female 5 4 2 4 0 5 7 Male 2 4 3 1 2 3 8 Male 3 8 1 2 1 5 9 Male 0 3 2 1 2 5 10 Female 3 5 1 2 1 3 11 Female 2 5 3 2 1 2 12 Female 3 8 1 0 1 6 13 Female 1 2 5 2 1 3 14 Male 2 7 1 1 0 3 15 Male 3 5 2 4 2 4 16 Female 5 6 2 4 1 0 17 Female 2 3 1 0 1 3 18 Male 2 6 3 0 1 6 19 Female 2 6 5 1 0 0 20 Female 2 6 4 5 0 3
根据y1中的值查找列平均值:
aggregate(.~y1,data=df2,mean)
输出
y1 y2 y3 y4 y5 y6 y7 1 Female 2.777778 5.000000 2.666667 2.222222 0.6666667 2.777778 2 Male 2.181818 4.818182 2.272727 2.000000 1.2727273 3.545455
广告