How to find the cumulative sum using two factor columns in an R data frame?
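Cumulative sums are usually computed for a single variable, and sometimes within the levels of one categorical variable; computing them within the combinations of two categorical variables is less common. To find cumulative sums for two categorical variables, convert the data frame to a data.table object and use the cumsum function, grouped by both factor columns, to define a column that holds the cumulative sums.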
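As a minimal sketch of the idea, assuming the data.table package is installed, the snippet below builds a small hypothetical table (the names `toy`, `g1`, `g2`, `x`, and `running` are made up for illustration and do not appear in the article) and adds a running total that restarts for each combination of the two grouping columns.

```r
library(data.table)

# Hypothetical toy data: two grouping columns and one numeric column
toy <- data.table(
  g1 = c("A", "A", "B", "A", "B"),
  g2 = c("T1", "T1", "T1", "T2", "T1"),
  x  = c(1, 2, 10, 5, 20)
)

# cumsum() is evaluated separately for every distinct (g1, g2) pair;
# := adds the result as a new column in place, keeping the original row order
toy[, running := cumsum(x), by = .(g1, g2)]
print(toy)
```

Here `.(g1, g2)` is data.table's shorthand for `list(g1, g2)`, the form used in the examples of this article.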
Example
Consider the following data frame:
> set.seed(1361)
> Factor1 <- as.factor(sample(LETTERS[1:4], 20, replace = TRUE))
> Factor2 <- as.factor(sample(c("T1", "T2", "T3", "T4"), 20, replace = TRUE))
> Response <- rpois(20, 5)
> df1 <- data.frame(Factor1, Factor2, Response)
> df1
Output
   Factor1 Factor2 Response
1        A      T2        9
2        B      T1        8
3        B      T1        2
4        A      T2        3
5        B      T1        7
6        B      T2        7
7        D      T2        7
8        D      T4        7
9        C      T4        6
10       B      T1        6
11       A      T2        4
12       A      T2        4
13       C      T1        7
14       B      T3        1
15       A      T3        6
16       D      T1        3
17       B      T1        8
18       D      T4        5
19       D      T2        3
20       C      T1        4
Load the data.table package:
> library(data.table)
Convert the data frame df1 to a data.table object:
> dt1<-data.table(df1)
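Calling `data.table(df1)` returns a new object and leaves `df1` unchanged. As a hedged aside, the package also offers `as.data.table()`, which likewise returns a copy, and `setDT()`, which converts the object in place. The sketch below uses a small hypothetical data frame `d` (not part of the article) to show the difference.

```r
library(data.table)

# Hypothetical example data, used only to illustrate the conversion functions
d <- data.frame(f = c("a", "b"), v = c(1, 2))

# as.data.table() returns a new data.table; d itself stays a plain data.frame
d_copy <- as.data.table(d)
was_dt_before <- is.data.table(d)   # FALSE at this point

# setDT() converts d itself to a data.table by reference, without copying
setDT(d)
is_dt_after <- is.data.table(d)     # TRUE after the in-place conversion
```

For a 20-row data frame the choice makes no practical difference; in-place conversion mainly matters for large data.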
Create a column named CumulativeSums that holds the cumulative sum of Response within each combination of Factor1 and Factor2:
Example
> dt1[, CumulativeSums := cumsum(Response), by = list(Factor1, Factor2)]
> dt1
Output
    Factor1 Factor2 Response CumulativeSums
 1:       A      T2        9              9
 2:       B      T1        8              8
 3:       B      T1        2             10
 4:       A      T2        3             12
 5:       B      T1        7             17
 6:       B      T2        7              7
 7:       D      T2        7              7
 8:       D      T4        7              7
 9:       C      T4        6              6
10:       B      T1        6             23
11:       A      T2        4             16
12:       A      T2        4             20
13:       C      T1        7              7
14:       B      T3        1              1
15:       A      T3        6              6
16:       D      T1        3              3
17:       B      T1        8             31
18:       D      T4        5             12
19:       D      T2        3             10
20:       C      T1        4             11
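The result above can be cross-checked without data.table. The following sketch uses base R's `ave()`, which is a different technique from the one this article describes, to compute the same grouped cumulative sums. It rebuilds `df1` from the printed values rather than from `set.seed()`, on the assumption that `sample()` may return different draws on other R versions.

```r
# Rebuild df1 from the printed output (values copied from the table above)
df1 <- data.frame(
  Factor1 = factor(c("A", "B", "B", "A", "B", "B", "D", "D", "C", "B",
                     "A", "A", "C", "B", "A", "D", "B", "D", "D", "C")),
  Factor2 = factor(c("T2", "T1", "T1", "T2", "T1", "T2", "T2", "T4", "T4", "T1",
                     "T2", "T2", "T1", "T3", "T3", "T1", "T1", "T4", "T2", "T1")),
  Response = c(9, 8, 2, 3, 7, 7, 7, 7, 6, 6, 4, 4, 7, 1, 6, 3, 8, 5, 3, 4)
)

# ave() applies FUN within each combination of the grouping factors
# and returns a vector in the original row order
df1$CumulativeSums <- ave(df1$Response, df1$Factor1, df1$Factor2, FUN = cumsum)
print(df1)
```

The CumulativeSums column produced this way should match the data.table output row for row, for example 9, 12, 16, 20 for the four (A, T2) rows.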
Let's look at another example.
Example
> G1 <- as.factor(sample(c("Hot", "Cold"), 20, replace = TRUE))
> G2 <- as.factor(sample(c("Low", "Medium", "Large"), 20, replace = TRUE))
> Y <- sample(1:100, 20)
> df2 <- data.frame(G1, G2, Y)
> df2
Output
     G1     G2  Y
1   Hot Medium 60
2  Cold    Low 94
3   Hot    Low 22
4  Cold Medium 90
5   Hot Medium 16
6   Hot  Large 32
7  Cold    Low 44
8   Hot    Low 73
9   Hot Medium 99
10  Hot Medium 68
11 Cold Medium 41
12 Cold  Large 77
13 Cold  Large 48
14 Cold Medium 20
15 Cold Medium 18
16 Cold    Low 12
17 Cold    Low 30
18  Hot    Low 23
19 Cold Medium 26
20 Cold Medium  4
Example
> dt2 <- data.table(df2)
> dt2[, CumulativeSums := cumsum(Y), by = list(G1, G2)]
> dt2
Output
      G1     G2  Y CumulativeSums
 1:  Hot Medium 60             60
 2: Cold    Low 94             94
 3:  Hot    Low 22             22
 4: Cold Medium 90             90
 5:  Hot Medium 16             76
 6:  Hot  Large 32             32
 7: Cold    Low 44            138
 8:  Hot    Low 73             95
 9:  Hot Medium 99            175
10:  Hot Medium 68            243
11: Cold Medium 41            131
12: Cold  Large 77             77
13: Cold  Large 48            125
14: Cold Medium 20            151
15: Cold Medium 18            169
16: Cold    Low 12            150
17: Cold    Low 30            180
18:  Hot    Low 23            118
19: Cold Medium 26            195
20: Cold Medium  4            199
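One way to sanity-check a grouped cumulative sum is to confirm that the last value in each group equals that group's total. The sketch below rebuilds the second data set from the printed values (the original code sets no seed, so the random draws cannot be reproduced) and compares the two; the names `chk`, `Total`, and `LastCumSum` are illustrative assumptions.

```r
library(data.table)

# Rebuild dt2 from the printed output (values copied from the table above)
dt2 <- data.table(
  G1 = factor(c("Hot", "Cold", "Hot", "Cold", "Hot", "Hot", "Cold", "Hot", "Hot", "Hot",
                "Cold", "Cold", "Cold", "Cold", "Cold", "Cold", "Cold", "Hot", "Cold", "Cold")),
  G2 = factor(c("Medium", "Low", "Low", "Medium", "Medium", "Large", "Low", "Low",
                "Medium", "Medium", "Medium", "Large", "Large", "Medium", "Medium",
                "Low", "Low", "Low", "Medium", "Medium")),
  Y  = c(60, 94, 22, 90, 16, 32, 44, 73, 99, 68, 41, 77, 48, 20, 18, 12, 30, 23, 26, 4)
)

# Same grouped cumulative sum as in the article
dt2[, CumulativeSums := cumsum(Y), by = list(G1, G2)]

# Per group: the plain total and the last cumulative value (.N is the group size)
chk <- dt2[, .(Total = sum(Y), LastCumSum = CumulativeSums[.N]), by = list(G1, G2)]
print(chk)
```

If the grouping is correct, Total and LastCumSum agree in every row of `chk`.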