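How to extract values from an R data frame column that do not start or end with specific characters?
Sometimes we want to extract values from a string column based on how they begin and end, for instance when some values were recorded with extra characters and we want to filter on them. To do this, we can negate grepl inside single square brackets. A pattern such as "^A|a$" matches values that start with A or end with a, so !grepl(...) keeps only the values that do neither.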
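As a minimal sketch of how the negated pattern behaves, the snippet below runs it on a small made-up vector. The names `states`, `hit`, and `kept` are illustrative assumptions and are not part of the article's data.

```r
# Hypothetical toy vector, used only to illustrate the pattern
states <- c("Alabama", "Ohio", "Arizona", "Texas", "Florida")

# TRUE where the value starts with "A" OR ends with "a"
hit <- grepl("^A|a$", states)

# Negating the match keeps values that neither start with "A" nor end with "a"
kept <- states[!hit]
print(kept)
```

Because `^` and `$` anchor each alternative separately, the `|` acts as a logical OR between "starts with A" and "ends with a", and the `!` turns that into "neither".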
Example
Consider the following data frame:
> x2<-c("Alabama", "Alaska", "American Samoa", "Arizona", "Arkansas", "California", "Colorado", "Connecticut", "Delaware", "District of Columbia", "Florida", "Georgia", "Guam", "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota", "Minor Outlying Islands", "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey", "New Mexico", "New York", "North Carolina", "North Dakota", "Northern Mariana Islands", "Ohio", "Oklahoma", "Oregon", "Pennsylvania", "Puerto Rico", "Rhode Island", "South Carolina", "South Dakota", "Tennessee", "Texas", "U.S. Virgin Islands", "Utah", "Vermont", "Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming")
> df2<-data.frame(x2)
> head(df2,20)
Output
                     x2
1               Alabama
2                Alaska
3        American Samoa
4               Arizona
5              Arkansas
6            California
7              Colorado
8           Connecticut
9              Delaware
10 District of Columbia
11              Florida
12              Georgia
13                 Guam
14               Hawaii
15                Idaho
16             Illinois
17              Indiana
18                 Iowa
19               Kansas
20             Kentucky
Finding the states that neither start with A nor end with a:
> df2[!grepl("^A|a$",df2$x2),]
Output
 [1] Colorado                 Connecticut              Delaware
 [4] Guam                     Hawaii                   Idaho
 [7] Illinois                 Kansas                   Kentucky
[10] Maine                    Maryland                 Massachusetts
[13] Michigan                 Minor Outlying Islands   Mississippi
[16] Missouri                 New Hampshire            New Jersey
[19] New Mexico               New York                 Northern Mariana Islands
[22] Ohio                     Oregon                   Puerto Rico
[25] Rhode Island             Tennessee                Texas
[28] U.S. Virgin Islands      Utah                     Vermont
[31] Washington               Wisconsin                Wyoming
57 Levels: Alabama Alaska American Samoa Arizona Arkansas ... Wyoming
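The result above is a factor vector rather than a data frame, because subsetting a single-column data frame with `[rows, ]` drops it to a vector, and the `Levels` line appears because the strings were stored as factors (the default before R 4.0.0). The following is a small sketch, using a hypothetical four-row data frame named `df`, of one way to keep the result as a data frame of plain strings. It is an optional variation, not the article's method.

```r
# Hypothetical four-row data frame; stringsAsFactors = FALSE keeps plain strings
df <- data.frame(x2 = c("Alabama", "Colorado", "Florida", "Guam"),
                 stringsAsFactors = FALSE)

# drop = FALSE stops R from collapsing the one-column result to a vector
res <- df[!grepl("^A|a$", df$x2), , drop = FALSE]
print(res)
```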
Let's look at another example:
> x1<- c("Indiaaa","Chinaaa","Russiaa","Canadaaa","Indonesiaaa","Croatiaaa","Mauritaniaaa","Albaniaaa","Angolaaa","Armeniaaa","Malaysiaaa","Maltaaa","Boliviaaa","Burmaaa","Panama","Romaniaa","Saudi-Arabia","Serbiaaa","Syriaaa","Tongaaa","Koreaaa","Libya")
> # sample() is random, so the y1 values below will differ from run to run
> y1<-sample(1:10,22,replace=TRUE)
> df1<-data.frame(x1,y1)
> df1
Output
             x1 y1
1       Indiaaa  6
2       Chinaaa  1
3       Russiaa  9
4      Canadaaa  7
5   Indonesiaaa  7
6     Croatiaaa  3
7  Mauritaniaaa  6
8     Albaniaaa  2
9      Angolaaa 10
10    Armeniaaa 10
11   Malaysiaaa  7
12      Maltaaa  3
13    Boliviaaa  2
14      Burmaaa 10
15       Panama  1
16     Romaniaa 10
17 Saudi-Arabia 10
18     Serbiaaa  8
19      Syriaaa 10
20      Tongaaa  5
21      Koreaaa  7
22        Libya  8
> df1[!grepl("^A|aa$",df1$x1),]
Output
             x1 y1
15       Panama  1
17 Saudi-Arabia 10
22        Libya  8
> df1[!grepl("^S|aa$",df1$x1),]
Output
       x1 y1
15 Panama  1
22  Libya  8
> df1[!grepl("^B|aa$",df1$x1),]
Output
             x1 y1
15       Panama  1
17 Saudi-Arabia 10
22        Libya  8
> df1[!grepl("^P|aa$",df1$x1),]
Output
             x1 y1
17 Saudi-Arabia 10
22        Libya  8
> df1[!grepl("^L|aa$",df1$x1),]
Output
             x1 y1
15       Panama  1
17 Saudi-Arabia 10
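The calls above differ only in the starting letter, so the repeated filter could be wrapped in a small function. The sketch below is one possible way to do that, using base R's `startsWith()` and `endsWith()` (available since R 3.3.0) instead of a regular expression, so the prefix and suffix are treated as literal text. The helper name `keep_neither` and the four-row `countries` data frame are assumptions made for illustration; the rows are a subset copied from the printed output above.

```r
# Hypothetical helper: keep rows whose `col` value neither starts with
# `prefix` nor ends with `suffix` (both matched literally, not as regex)
keep_neither <- function(df, col, prefix, suffix) {
  values <- as.character(df[[col]])  # also handles factor columns
  drop_row <- startsWith(values, prefix) | endsWith(values, suffix)
  df[!drop_row, , drop = FALSE]
}

# Small illustrative subset of the country data shown above
countries <- data.frame(
  x1 = c("Indiaaa", "Panama", "Saudi-Arabia", "Libya"),
  y1 = c(6, 1, 10, 8),
  stringsAsFactors = FALSE
)

# Same filter as the "^S|aa$" pattern: drops "Indiaaa" and "Saudi-Arabia"
res_s <- keep_neither(countries, "x1", "S", "aa")
print(res_s)
```

Literal matching avoids surprises when the prefix or suffix contains regex metacharacters such as `.` or `$`, while the `grepl` form remains the more flexible choice when a real pattern is needed.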