How to find the frequency of a particular string in a column based on another column in an R data frame using the dplyr package?
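When an R data frame has two or more categorical columns, with levels stored as strings or as numbers (character or integer), we can count how often each level of one column occurs for a given value of another column. These cross-column frequencies show how one categorical column is distributed across another. With the dplyr package, we do this by keeping only the rows of interest with the filter function and then counting the other column with the count function.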
Example
Consider the following data frame:
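# sample() draws at random, so your values will differ from the output shown unless a seed is set
Group <- sample(1:5, 20, replace = TRUE)
Standard <- sample(c("I", "II", "III"), 20, replace = TRUE)
df1 <- data.frame(Group, Standard)
df1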
Output
   Group Standard
1      3      III
2      5      III
3      5        I
4      3        I
5      2       II
6      4       II
7      3      III
8      2        I
9      1       II
10     4      III
11     3       II
12     4      III
13     4      III
14     4      III
15     4      III
16     4      III
17     5      III
18     3       II
19     5      III
20     1      III
Finding the frequency of each Group value for Standard I:
library(dplyr)
df1 %>% filter(Standard == "I") %>% count(Group)
Output
  Group n
1     2 1
2     3 1
3     5 1
Example
df1 %>% filter(Standard == "II") %>% count(Group)
Output
  Group n
1     1 1
2     2 1
3     3 2
4     4 1
Example
df1 %>% filter(Standard == "III") %>% count(Group)
Output
  Group n
1     1 1
2     3 2
3     4 6
4     5 3
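The three filter calls above each handle one value of Standard. As a minimal sketch, the same frequencies can be obtained in a single step by passing both columns to count. The data frame here is rebuilt by hand from the printed output so that the result is reproducible, and the name all_counts is only illustrative.

```r
library(dplyr)

# df1 rebuilt from the values printed above (not re-sampled)
df1 <- data.frame(
  Group = c(3, 5, 5, 3, 2, 4, 3, 2, 1, 4,
            3, 4, 4, 4, 4, 4, 5, 3, 5, 1),
  Standard = c("III", "III", "I", "I", "II", "II", "III", "I", "II", "III",
               "II", "III", "III", "III", "III", "III", "III", "II", "III", "III")
)

# One row per (Standard, Group) pair that occurs, with its frequency in n
all_counts <- df1 %>% count(Standard, Group)
all_counts
```

Each block of rows for a given Standard should match the corresponding filtered result. Pairs that never occur, such as Standard I with Group 1, are simply absent from the result.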
Let's have a look at another example:
Class <- sample(c("First", "Second", "Third"), 20, replace = TRUE)
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
df2 <- data.frame(Gender, Class)
df2
Output
   Gender  Class
1  Female  Third
2  Female  First
3  Female Second
4    Male  Third
5    Male  Third
6  Female Second
7    Male  First
8  Female  Third
9  Female Second
10 Female Second
11 Female  First
12 Female Second
13   Male  First
14 Female  Third
15 Female  Third
16   Male  Third
17   Male  Third
18   Male Second
19 Female Second
20   Male Second
Example
df2 %>% filter(Class == "Third") %>% count(Gender)
Output
  Gender n
1 Female 4
2   Male 4
Example
df2 %>% filter(Class == "First") %>% count(Gender)
Output
  Gender n
1 Female 2
2   Male 2
Example
df2 %>% filter(Class == "Second") %>% count(Gender)
Output
  Gender n
1 Female 6
2   Male 2
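As a cross-check on the filtered counts, base R's table function can tabulate both columns at once. This is an alternative to the dplyr approach rather than part of it. The sketch below rebuilds df2 by hand from the printed output, and the name cross_tab is only illustrative.

```r
# df2 rebuilt from the values printed above (not re-sampled)
Gender <- c("Female", "Female", "Female", "Male", "Male",
            "Female", "Male", "Female", "Female", "Female",
            "Female", "Female", "Male", "Female", "Female",
            "Male", "Male", "Male", "Female", "Male")
Class <- c("Third", "First", "Second", "Third", "Third",
           "Second", "First", "Third", "Second", "Second",
           "First", "Second", "First", "Third", "Third",
           "Third", "Third", "Second", "Second", "Second")
df2 <- data.frame(Gender, Class)

# Rows are Gender levels, columns are Class levels; each cell is a frequency
cross_tab <- table(df2$Gender, df2$Class)
cross_tab
```

Each column of the table corresponds to one of the filter and count results, for example the Second column holds 6 for Female and 2 for Male.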