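How to find the frequency of a particular string in a column based on another column of an R data frame using the dplyr package?
When an R data frame has two or more categorical columns whose levels are strings, or numbers stored as strings or integers, we can find the frequency of one column's values for a given value of another column. This gives the cross-column frequencies and shows how one categorical column is distributed across the levels of the other. With the dplyr package, we can do this by using the filter function to keep only the rows that match the string and then the count function to tally the other column.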
Example
Consider the following data frame:
Group <- sample(1:5, 20, replace = TRUE)
Standard <- sample(c("I", "II", "III"), 20, replace = TRUE)
df1 <- data.frame(Group, Standard)
df1
Output
   Group Standard
1      3      III
2      5      III
3      5        I
4      3        I
5      2       II
6      4       II
7      3      III
8      2        I
9      1       II
10     4      III
11     3       II
12     4      III
13     4      III
14     4      III
15     4      III
16     4      III
17     5      III
18     3       II
19     5      III
20     1      III
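Because sample() draws at random, the rows printed above will differ on every run. A minimal sketch of a repeatable version follows, assuming an arbitrary seed of 123 and a hypothetical name df1_seeded; neither appears in the original example, and this seed is not expected to reproduce the exact rows shown above.

```r
# Fix the random number generator so the same rows are drawn on every run.
# The seed value 123 is an arbitrary assumption, not taken from the original.
set.seed(123)

Group <- sample(1:5, 20, replace = TRUE)
Standard <- sample(c("I", "II", "III"), 20, replace = TRUE)

# df1_seeded is a hypothetical name used to avoid overwriting df1.
df1_seeded <- data.frame(Group, Standard)
df1_seeded
```

Running the block again with the same seed should give the same data frame, which makes the frequency tables easier to compare between sessions.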
Finding the frequency of each Group for a given Standard:
library(dplyr)
df1 %>% filter(Standard == "I") %>% count(Group)
Output
  Group n
1     2 1
2     3 1
3     5 1
Example
df1 %>% filter(Standard == "II") %>% count(Group)
Output
  Group n
1     1 1
2     2 1
3     3 2
4     4 1
Example
df1 %>% filter(Standard == "III") %>% count(Group)
Output
  Group n
1     1 1
2     3 2
3     4 6
4     5 3
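The three filtered counts above can also be obtained in a single call by passing both columns to count. The sketch below rebuilds the data frame from the values printed earlier, since the original was generated randomly; the names df1_printed and all_counts are hypothetical.

```r
library(dplyr)

# Rebuild the data frame from the 20 rows printed in the output above.
df1_printed <- data.frame(
  Group = c(3, 5, 5, 3, 2, 4, 3, 2, 1, 4, 3, 4, 4, 4, 4, 4, 5, 3, 5, 1),
  Standard = c("III", "III", "I", "I", "II", "II", "III", "I", "II", "III",
               "II", "III", "III", "III", "III", "III", "III", "II", "III", "III")
)

# One row per (Standard, Group) combination that occurs, with its frequency n.
all_counts <- df1_printed %>% count(Standard, Group)
all_counts

# Filtering the combined table gives the same result as filtering first.
all_counts %>% filter(Standard == "III")
```

The row for Standard "III" and Group 4 should show n = 6, matching the filtered result above. Combinations that never occur, such as Standard "I" with Group 1, are simply absent from the table.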
Let's look at another example:
Class <- sample(c("First", "Second", "Third"), 20, replace = TRUE)
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
df2 <- data.frame(Gender, Class)
df2
Output
   Gender  Class
1  Female  Third
2  Female  First
3  Female Second
4    Male  Third
5    Male  Third
6  Female Second
7    Male  First
8  Female  Third
9  Female Second
10 Female Second
11 Female  First
12 Female Second
13   Male  First
14 Female  Third
15 Female  Third
16   Male  Third
17   Male  Third
18   Male Second
19 Female Second
20   Male Second
Example
df2 %>% filter(Class == "Third") %>% count(Gender)
Output
  Gender n
1 Female 4
2   Male 4
Example
df2 %>% filter(Class == "First") %>% count(Gender)
Output
  Gender n
1 Female 2
2   Male 2
Example
df2 %>% filter(Class == "Second") %>% count(Gender)
Output
  Gender n
1 Female 6
2   Male 2
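As a cross-check that does not depend on dplyr, base R's table function can tabulate both columns at once. This is a sketch rather than part of the original method; it rebuilds the data frame from the rows printed above, and the names df2_printed and cross_tab are hypothetical.

```r
# Rebuild the data frame from the 20 rows printed in the output above.
df2_printed <- data.frame(
  Gender = c("Female", "Female", "Female", "Male", "Male", "Female", "Male",
             "Female", "Female", "Female", "Female", "Female", "Male", "Female",
             "Female", "Male", "Male", "Male", "Female", "Male"),
  Class = c("Third", "First", "Second", "Third", "Third", "Second", "First",
            "Third", "Second", "Second", "First", "Second", "First", "Third",
            "Third", "Third", "Third", "Second", "Second", "Second")
)

# Rows are Class levels, columns are Gender levels, cells are frequencies.
cross_tab <- table(Class = df2_printed$Class, Gender = df2_printed$Gender)
cross_tab

# A single cell corresponds to one filter-then-count result.
cross_tab["Third", "Female"]
```

Each row of the table should agree with the corresponding filtered count, for example 4 Female and 4 Male for Class "Third".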