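How to filter an R data frame based on string matches in two columns with an OR condition?
To filter an R data frame on string matches in two columns combined with an OR condition, we can use the `grepl` function together with double square brackets and the OR operator `|`. For example, if we have a data frame named `df` that contains two string columns (say x and y), we can keep the rows where a particular string matches in either column as follows:
Syntax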
df[grepl("text",df[["x"]])|grepl("text",df[["y"]]),]
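A minimal sketch of the syntax above on a small hand-built data frame may make the pieces easier to see. The data frame contents and the names `keep` and `result` are assumptions made for illustration only. Note that `grepl` is case-sensitive by default, so "TEXT" does not match the pattern "text".

```r
# Hypothetical data frame for illustration; not taken from the examples below
df <- data.frame(
  x = c("some text here", "nothing", "other", "plain"),
  y = c("none", "more text", "other", "TEXT"),
  stringsAsFactors = FALSE
)

# TRUE where the pattern occurs in x or in y (element-wise OR)
keep <- grepl("text", df[["x"]]) | grepl("text", df[["y"]])

# Keep the matching rows and all columns
result <- df[keep, ]
print(result)
```

Storing the logical vector in `keep` first is only a readability choice; it is equivalent to writing the condition directly inside the brackets.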
Check out the examples below to understand how it works.
Example 1
Consider the following data frame:
f1 <- sample(c("India","China","Egypt","UK"), 20, replace=TRUE)
f2 <- sample(c("India","China","Egypt","UK"), 20, replace=TRUE)
v1 <- rnorm(20)
df1 <- data.frame(f1, f2, v1)
df1
Output
      f1    f2          v1
1  India India  0.58383357
2     UK Egypt -0.71045054
3  India China -0.07848666
4  Egypt India  1.21017481
5  Egypt    UK -0.81991817
6  Egypt China  1.98979283
7  India India  0.36160374
8  Egypt China -1.77619986
9  China    UK -0.05397712
10 India Egypt -0.30372078
11 Egypt India -1.68623489
12 India India -0.41997104
13 India China -0.97064798
14    UK Egypt  2.02704796
15    UK Egypt -0.47732133
16 China China  0.53153059
17 Egypt    UK -1.71608164
18 Egypt India -0.73298689
19    UK    UK  1.83674440
20 China China -1.12186527
Filter `df1` to the rows where "India" matches in either of the first two columns:
df1 <- df1[grepl("India", df1[["f1"]]) | grepl("India", df1[["f2"]]), ]
df1
Output
      f1    f2          v1
1  India India  0.58383357
3  India China -0.07848666
4  Egypt India  1.21017481
7  India India  0.36160374
10 India Egypt -0.30372078
11 Egypt India -1.68623489
12 India India -0.41997104
13 India China -0.97064798
18 Egypt India -0.73298689
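Because `sample` and `rnorm` draw random values, rerunning Example 1 will generally produce different rows than those shown. The sketch below is one way to make the run repeatable and to sanity-check the filter. The seed value 42 and the names `countries`, `keep`, `filtered` and `dropped` are assumptions for illustration, not part of the original example.

```r
set.seed(42)  # arbitrary seed (an assumption), chosen only for repeatability

countries <- c("India", "China", "Egypt", "UK")
df1 <- data.frame(
  f1 = sample(countries, 20, replace = TRUE),
  f2 = sample(countries, 20, replace = TRUE),
  v1 = rnorm(20)
)

# Rows where "India" appears in f1 or in f2
keep <- grepl("India", df1[["f1"]]) | grepl("India", df1[["f2"]])
filtered <- df1[keep, ]
dropped  <- df1[!keep, ]

# Every kept row has "India" in at least one column; no dropped row does
stopifnot(all(filtered$f1 == "India" | filtered$f2 == "India"))
stopifnot(!any(dropped$f1 == "India" | dropped$f2 == "India"))
```

Assigning the result to a new name such as `filtered`, rather than overwriting `df1`, keeps the unfiltered data available for further checks.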
Example 2
g1 <- sample(c("Male","Female"), 20, replace=TRUE)
g2 <- sample(c("Male","Female"), 20, replace=TRUE)
v2 <- rpois(20, 5)  # created here but not included in df2
df2 <- data.frame(g1, g2)
df2
Output
       g1     g2
1  Female   Male
2  Female   Male
3  Female Female
4    Male   Male
5    Male Female
6  Female Female
7  Female   Male
8    Male   Male
9    Male Female
10   Male Female
11 Female Female
12   Male   Male
13   Male   Male
14   Male Female
15 Female   Male
16 Female   Male
17 Female   Male
18   Male Female
19 Female Female
20   Male Female
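Filter `df2` to the rows where "Female" matches in either of the two columns:
df2 <- df2[grepl("Female", df2[["g1"]]) | grepl("Female", df2[["g2"]]), ]
df2
Output
       g1     g2
1  Female   Male
2  Female   Male
3  Female Female
5    Male Female
6  Female Female
7  Female   Male
9    Male Female
10   Male Female
11 Female Female
14   Male Female
15 Female   Male
16 Female   Male
17 Female   Male
18   Male Female
19 Female Female
20   Male Female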
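One thing to keep in mind with this approach is that `grepl` matches the pattern anywhere inside the string. With values like "Male" and "Female", a case-insensitive search for "male" would also match "Female". The sketch below, using a hypothetical vector `g`, contrasts a loose match with an anchored pattern and with plain equality.

```r
# Hypothetical values for illustration
g <- c("Male", "Female", "male")

# Substring match, ignoring case: "Female" contains "male", so all match
loose <- grepl("male", g, ignore.case = TRUE)

# Anchored pattern: the whole string must be "male" (in any case)
exact <- grepl("^male$", g, ignore.case = TRUE)

# Plain equality: case-sensitive, whole-string comparison
same <- g == "Male"

print(data.frame(g, loose, exact, same))
```

When the goal is to match complete values rather than substrings, `==` or `%in%` combined with `|` may be the simpler choice.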