How to add a column to an R data frame showing each value's percentage within its group?
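In data analysis, we often need the percentage that each value contributes to the total of its group. This helps us see which values make up a large share of the group and which make up a small one. The percentages can also be plotted as a pie chart to give readers a clearer picture of the data. With the mutate function of the dplyr package, applied after group_by, adding a column of within-group percentages is straightforward, as the examples here show.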
Example 1
> # Two groups of five rows each, with random frequencies between 1 and 100
> Group <- rep(1:2, each = 5)
> Frequency <- sample(1:100, 10)
> df1 <- data.frame(Group, Frequency)
> df1
Output
   Group Frequency
1      1        67
2      1        58
3      1        54
4      1        13
5      1        23
6      2        91
7      2         3
8      2        95
9      2        38
10     2        48
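Because sample() draws random values, rerunning the code above will generally give different numbers from the output shown. The following is a minimal sketch of one way to make the data frame reproducible, assuming that is desirable. The seed value 123 and the names df1_seeded and same_draw are arbitrary choices for illustration, not part of the original example.

```r
# Fix the random number generator state so sample() returns the same draw on every run.
# The seed 123 is arbitrary (an assumption, not taken from the original example).
set.seed(123)
Group <- rep(1:2, each = 5)
Frequency <- sample(1:100, 10)
df1_seeded <- data.frame(Group, Frequency)

# Resetting the same seed and drawing again reproduces the same values.
set.seed(123)
same_draw <- identical(Frequency, sample(1:100, 10))
print(same_draw)
```

The actual numbers drawn depend on the seed and on the R version, so they will not match the output shown in this article.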
> library(dplyr)
Finding the percentage of each value within its group:
> # Divide each Frequency by its group total, round to 2 decimals, and append "%"
> df1 %>%
+   group_by(Group) %>%
+   mutate(Percentage = paste0(round(Frequency / sum(Frequency) * 100, 2), "%"))
Output
# A tibble: 10 x 3
# Groups:   Group [2]
   Group Frequency Percentage
   <int>     <int> <chr>
 1     1        67 31.16%
 2     1        58 26.98%
 3     1        54 25.12%
 4     1        13 6.05%
 5     1        23 10.7%
 6     2        91 33.09%
 7     2         3 1.09%
 8     2        95 34.55%
 9     2        38 13.82%
10     2        48 17.45%
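Since paste0 appends a "%" sign, the Percentage column above is stored as character (<chr>) and cannot be used directly in further arithmetic. As a minimal sketch, assuming a numeric column is preferred, the same calculation can be kept numeric and checked per group. The column name Pct and the objects df1_pct and group_totals are hypothetical names chosen for illustration. The data is typed in from the Example 1 output so the result does not depend on sample().

```r
library(dplyr)

# Values copied from the Example 1 output so the result is reproducible.
df1 <- data.frame(
  Group = rep(1:2, each = 5),
  Frequency = c(67, 58, 54, 13, 23, 91, 3, 95, 38, 48)
)

# Keep the percentage as a number instead of a "%"-suffixed string.
# Pct is a hypothetical column name, not from the original example.
df1_pct <- df1 %>%
  group_by(Group) %>%
  mutate(Pct = Frequency / sum(Frequency) * 100) %>%
  ungroup()

# Sanity check: the percentages within each group should add up to 100.
group_totals <- tapply(df1_pct$Pct, df1_pct$Group, sum)
print(group_totals)
```

Rounding and formatting can then be applied only when the values are displayed, which keeps the stored column usable for sorting, filtering, and plotting.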
Example 2
> # Five male and five female rows, with random salaries between 25000 and 50000
> Gender <- rep(c("Male", "Female"), each = 5)
> Salary <- sample(25000:50000, 10)
> df2 <- data.frame(Gender, Salary)
> df2
Output
   Gender Salary
1    Male  41734
2    Male  39035
3    Male  36161
4    Male  33437
5    Male  45123
6  Female  44492
7  Female  48456
8  Female  31569
9  Female  35110
10 Female  43630
> # Each salary as a percentage of the total salary for that gender
> df2 %>%
+   group_by(Gender) %>%
+   mutate(Percentage = paste0(round(Salary / sum(Salary) * 100, 2), "%"))
Output
# A tibble: 10 x 3
# Groups:   Gender [2]
   Gender Salary Percentage
   <fct>   <int> <chr>
 1 Male    41734 21.35%
 2 Male    39035 19.97%
 3 Male    36161 18.5%
 4 Male    33437 17.1%
 5 Male    45123 23.08%
 6 Female  44492 21.89%
 7 Female  48456 23.84%
 8 Female  31569 15.53%
 9 Female  35110 17.27%
10 Female  43630 21.47%
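The same within-group percentage can also be computed without dplyr. The sketch below is an alternative to the method used in this article, not a replacement for it: it uses the base R function ave(), which returns the group total aligned to every row. The data is typed in from the Example 2 output, and group_sum is a hypothetical helper variable.

```r
# Values copied from the Example 2 output so the result is reproducible.
df2 <- data.frame(
  Gender = rep(c("Male", "Female"), each = 5),
  Salary = c(41734, 39035, 36161, 33437, 45123,
             44492, 48456, 31569, 35110, 43630)
)

# ave() applies FUN within each Gender and repeats the result on every row
# of that group, so group_sum has one entry per row of df2.
group_sum <- ave(df2$Salary, df2$Gender, FUN = sum)

# Same formatting as in the dplyr version: 2 decimals with a "%" suffix.
df2$Percentage <- paste0(round(df2$Salary / group_sum * 100, 2), "%")
print(df2)
```

This variant adds the column to the data frame in place and returns an ordinary data.frame rather than a grouped tibble.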
Example 3
> # Five grades with two rows each, with random years in job between 1 and 5
> Grade <- rep(c("A", "B", "C", "D", "E"), each = 2)
> Number_of_Years_in_Job <- sample(1:5, 10, replace = TRUE)
> df3 <- data.frame(Grade, Number_of_Years_in_Job)
> df3
Output
   Grade Number_of_Years_in_Job
1      A                      4
2      A                      5
3      B                      4
4      B                      4
5      C                      1
6      C                      4
7      D                      1
8      D                      1
9      E                      3
10     E                      1
> # Each row's years as a percentage of the total years for that grade
> df3 %>%
+   group_by(Grade) %>%
+   mutate(Percentage = paste0(round(Number_of_Years_in_Job / sum(Number_of_Years_in_Job) * 100, 2), "%"))
Output
# A tibble: 10 x 3
# Groups:   Grade [5]
   Grade Number_of_Years_in_Job Percentage
   <fct>                  <int> <chr>
 1 A                          4 44.44%
 2 A                          5 55.56%
 3 B                          4 50%
 4 B                          4 50%
 5 C                          1 20%
 6 C                          4 80%
 7 D                          1 50%
 8 D                          1 50%
 9 E                          3 75%
10 E                          1 25%
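The introduction notes that group percentages can be drawn as a pie chart. As a rough sketch of that idea, the following uses the base R pie() function on the Group 1 values from Example 1. The slice labels and the chart title are made up for illustration, and freq, pct, and slice_labels are hypothetical variable names.

```r
# Frequency values of Group 1, copied from the Example 1 output.
freq <- c(67, 58, 54, 13, 23)

# Percentage of each value within the group, rounded to 2 decimals.
pct <- round(freq / sum(freq) * 100, 2)

# Label each slice with its row position and percentage (illustrative labels).
slice_labels <- paste0("Row ", seq_along(freq), ": ", pct, "%")

# Draw the pie chart on the current graphics device.
pie(freq, labels = slice_labels, main = "Group 1: share of Frequency")
```

In an interactive session the chart appears in the plot window. When the script is run non-interactively, R writes the plot to its default graphics file instead.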