How to find the frequency of a particular string in one column based on another column of an R data frame using the dplyr package?


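When an R data frame has two or more categorical columns, with levels stored as strings or as numbers (either character or integer), we can find the frequency of one column conditional on a value in another column. This cross-column frequency shows how one categorical column is distributed within each level of the other. To do this with the dplyr package, we can use the filter function to keep the rows that match a given value and then the count function to tally the other column.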

Example

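Consider the following data frame. Because sample() draws values at random, your data will differ from the output shown here unless you set a seed with set.seed():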

Group <- sample(1:5, 20, replace = TRUE)
Standard <- sample(c("I", "II", "III"), 20, replace = TRUE)
df1 <- data.frame(Group, Standard)
df1

Output

   Group Standard
1      3      III
2      5      III
3      5        I
4      3        I
5      2       II
6      4       II
7      3      III
8      2        I
9      1       II
10     4      III
11     3       II
12     4      III
13     4      III
14     4      III
15     4      III
16     4      III
17     5      III
18     3       II
19     5      III
20     1      III

Finding the frequency of each Group within a given Standard:

library(dplyr)
df1%>%filter(Standard=="I")%>%count(Group)

Output

  Group n
1     2 1
2     3 1
3     5 1

Example

df1 %>% filter(Standard == "II") %>% count(Group)

Output

  Group n
1     1 1
2     2 1
3     3 2
4     4 1

Example

df1 %>% filter(Standard == "III") %>% count(Group)

Output

  Group n
1     1 1
2     3 2
3     4 6
4     5 3

Let us look at another example:

Class <- sample(c("First", "Second", "Third"), 20, replace = TRUE)
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
df2 <- data.frame(Gender, Class)
df2

Output

   Gender  Class
1  Female  Third
2  Female  First
3  Female Second
4    Male  Third
5    Male  Third
6  Female Second
7    Male  First
8  Female  Third
9  Female Second
10 Female Second
11 Female  First
12 Female Second
13   Male  First
14 Female  Third
15 Female  Third
16   Male  Third
17   Male  Third
18   Male Second
19 Female Second
20   Male Second

Example

df2 %>% filter(Class == "Third") %>% count(Gender)

Output

  Gender n
1 Female 4
2   Male 4

Example

df2 %>% filter(Class == "First") %>% count(Gender)

Output

  Gender n
1 Female 2
2   Male 2

Example

df2 %>% filter(Class == "Second") %>% count(Gender)

Output

  Gender n
1 Female 6
2   Male 2
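
Running one filter per level gets repetitive when a column has many levels. As a rough sketch, passing both columns to count should give every combination in a single call. The data frame df_demo and its values are hypothetical, chosen only so that the result is reproducible; they are not taken from the examples above.

```r
library(dplyr)

# Hypothetical data with fixed values (not from the article's random samples)
df_demo <- data.frame(
  Gender = c("Female", "Male", "Female", "Male", "Female", "Female"),
  Class  = c("First", "First", "Second", "Second", "Second", "Third")
)

# One row per (Class, Gender) combination that occurs in the data
all_counts <- df_demo %>% count(Class, Gender)
all_counts

# The filter-then-count pattern for a single class, for comparison
second_only <- df_demo %>% filter(Class == "Second") %>% count(Gender)
second_only
```

With character columns like these, combinations that never occur (here, Male in Third) simply do not appear in the result, which is the same behaviour the single-level examples show.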

Updated on: November 7, 2020