How to create a relative frequency table using dplyr in R?


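Relative frequency is the proportion of the total that one item accounts for. For example, if we have 5 bananas, 6 guavas, and 10 pomegranates, the relative frequency of bananas is 5 divided by the sum of 5, 6, and 10, which is 21, giving 5/21 or roughly 0.238. For this reason it is also called proportional frequency.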
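The arithmetic in the fruit example can be sketched in base R. The vector name `counts` and its element names are assumptions chosen for illustration only.

```r
# Counts from the fruit example in the text
counts <- c(banana = 5, guava = 6, pomegranate = 10)

# Relative frequency = each count divided by the total (21)
rel_freq <- counts / sum(counts)

# Show the proportions rounded to three decimals
print(round(rel_freq, 3))

# Relative frequencies sum to 1
print(sum(rel_freq))
```

Dividing a vector by its own sum is the same operation the dplyr pipelines in the examples apply to the column of group counts.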

Example 1

Consider the following data frame:

set.seed(21)
x <- sample(LETTERS[1:4], 20, replace = TRUE)
Ratings <- sample(1:50, 20)
df1 <- data.frame(x, Ratings)
df1

Output

x Ratings
1 C 44
2 A 29
3 C 14
4 A 10
5 B 46
6 C 1
7 D 47
8 A 8
9 C 23
10 C 7
11 D 50
12 B 31
13 B 34
14 B 3
15 D 48
16 B 33
17 C 45
18 B 9
19 B 40
20 C 21

Load the dplyr package:

library(dplyr)

Find the relative frequency table for the values in x:

df1 %>% group_by(x) %>% summarise(n = n()) %>% mutate(freq = n / sum(n))

Output

`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 4 x 3
x n freq
<chr> <int> <dbl>
1 A 3 0.15
2 B 7 0.35
3 C 7 0.35
4 D 3 0.15
Warning message:
`...` is not empty.
We detected these problematic arguments:
* `needs_dots`
These dots only exist to allow future extensions and should be empty.
Did you misspecify an argument?

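Note: This warning message can be ignored. The relative frequencies are computed correctly, and the warning is unrelated to the calculation; it appears to be raised while the tibble is printed, not by the pipeline itself.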
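The group_by(), summarise(), mutate() pipeline above can also be written more compactly with dplyr's count(), which performs the grouping and counting in one step. The sketch below is an alternative, not the method used in the example. The data frame `df_demo` is hypothetical: it is built to have the same group counts as the output above (A = 3, B = 7, C = 7, D = 3) rather than reproducing the random sample.

```r
library(dplyr)

# Hypothetical data with the same group counts as the table above
# (A = 3, B = 7, C = 7, D = 3); not the random sample from the example
df_demo <- data.frame(x = rep(c("A", "B", "C", "D"), times = c(3, 7, 7, 3)))

# count() is shorthand for group_by() followed by summarise(n = n())
freq_tbl <- df_demo %>%
  count(x) %>%
  mutate(freq = n / sum(n))

print(freq_tbl)
```

Because count() returns an ungrouped result, sum(n) in the mutate() step is the overall total, so freq is each group's share of all rows.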

Example 2

y <- sample(c("Male", "Female"), 20, replace = TRUE)
Salary <- sample(20000:50000, 20)
df2 <- data.frame(y, Salary)
df2

Output

   y Salary
1 Female 40907
2 Female 47697
3 Male 49419
4 Female 23818
5 Male 21585
6 Male 22276
7 Female 21856
8 Male 22092
9 Male 27892
10 Female 47655
11 Male 34933
12 Female 48027
13 Female 48179
14 Male 21460
15 Male 24233
16 Female 43762
17 Female 22369
18 Female 47206
19 Male 34972
20 Female 30222

Find the relative frequency of each gender in y:

df2 %>% group_by(y) %>% summarise(n = n()) %>% mutate(freq = n / sum(n))

Output

`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 2 x 3
y n freq
<chr> <int> <dbl>
1 Female 11 0.55
2 Male 9 0.45
Warning message:
`...` is not empty.
We detected these problematic arguments:
* `needs_dots`
These dots only exist to allow future extensions and should be empty.
Did you misspecify an argument?
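As a quick cross-check on a result like the one above, the same proportions can be obtained in base R with table() and prop.table(). This is a sketch under stated assumptions: the vector `y_demo` is hypothetical and is constructed to match the counts in the output (11 Female, 9 Male), not drawn from the random sample.

```r
# Hypothetical vector with the same counts as the output above
# (11 Female, 9 Male); not the random sample from the example
y_demo <- rep(c("Female", "Male"), times = c(11, 9))

# table() counts each distinct value; prop.table() divides by the total
rel_freq_y <- prop.table(table(y_demo))

print(rel_freq_y)
```

The dplyr version is usually more convenient when the result needs to stay a data frame for further steps, while the base R version suits a one-off check.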

Updated on: 07-11-2020
