How to standardize a column of a data.table object by group in R?


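To standardize a column of a data.table object by group, we can apply the scale function to the column and supply the grouping column through the by argument.

For example, if we have a data.table object called DT with two columns, G and Num, where G is the grouping column and Num is a numeric column, we can standardize Num within each level of G using the command below: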

DT[,"Num":=as.vector(scale(Num)),by=G]
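As a rough check of what this command computes, the sketch below compares it with a manual z-score, (x - mean(x)) / sd(x), calculated within each group. The toy data and the column names Manual and Scaled are made up for illustration and are not part of the original example.

```r
library(data.table)

# Hypothetical toy data: two groups with three values each
DT <- data.table(G = c("a", "a", "a", "b", "b", "b"),
                 Num = c(1, 2, 3, 10, 20, 30))

# Manual z-score within each group (sd() uses the n - 1 denominator)
DT[, Manual := (Num - mean(Num)) / sd(Num), by = G]

# Same calculation via scale(); as.vector() drops the matrix shape
# and the attributes that scale() attaches to its result
DT[, Scaled := as.vector(scale(Num)), by = G]

# The two columns should agree up to floating-point error
all.equal(DT$Manual, DT$Scaled)
```

Without as.vector(), scale() returns a one-column matrix, which is why the command above wraps the call before assigning the result back to the column.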

Example 1

Consider the data.table object below:

library(data.table)
Grp<-sample(c("Male","Female"),20,replace=TRUE)
Response<-round(rnorm(20,5,1.25),2)
DT1<-data.table(Grp,Response)
DT1

The following data.table is created (the values will differ from run to run, since no seed is set):

       Grp Response
 1: Female 5.31
 2: Male   5.20
 3: Female 6.38
 4: Male   4.53
 5: Female 4.90
 6: Female 4.78
 7: Male   3.73
 8: Female 6.19
 9: Male   4.33
10: Male   7.84
11: Male   6.70
12: Female 5.11
13: Male   6.80
14: Male   3.76
15: Male   3.56
16: Male   5.51
17: Female 6.58
18: Female 7.59
19: Male   4.62
20: Female 6.75

To standardize the Response column by the Grp column in DT1 created above, add the following code to the snippet above:

DT1[,"Response":=as.vector(scale(Response)),by=Grp]
DT1

Output

If you execute all of the snippets given above as a single program, it generates the following output:

     Grp    Response
 1: Female -0.66313371
 2: Male    0.03955265
 3: Female  0.43789692
 4: Male   -0.43061348
 5: Female -1.08502396
 6: Female -1.20850403
 7: Male   -0.99200587
 8: Female  0.24238681
 9: Male   -0.57096158
10: Male    1.89214752
11: Male    1.09216337
12: Female -0.86893383
13: Male    1.16233742
14: Male   -0.97095365
15: Male   -1.11130175
16: Male    0.25709220
17: Female  0.64369704
18: Female  1.68298763
19: Male   -0.36745684
20: Female  0.81862714
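One way to confirm that the standardization worked is to check that every group ends up with a mean of about 0 and a standard deviation of 1. The sketch below is an illustration under a few assumptions: the seed, the fixed group assignment via rep() (used instead of sample() so that both groups are guaranteed to have ten rows), and the new column name Response_z are not from the original example.

```r
library(data.table)

set.seed(42)  # assumed seed, only so the sketch is reproducible
Grp <- rep(c("Male", "Female"), times = 10)
Response <- round(rnorm(20, 5, 1.25), 2)
DT1 <- data.table(Grp, Response)

# Store the z-scores in a new (hypothetical) column so Response is preserved
DT1[, Response_z := as.vector(scale(Response)), by = Grp]

# Per-group mean should be close to 0 and per-group sd should be 1
check <- DT1[, .(mean_z = mean(Response_z), sd_z = sd(Response_z)), by = Grp]
check
```

Writing to a separate column rather than overwriting Response keeps the raw values available for comparison.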

Example 2

The following snippet creates a sample data.table:

Class<-sample(c("I","II","III"),20,replace=TRUE)
Rate<-round(rnorm(20,10,1.02),0)
DT2<-data.table(Class,Rate)
DT2

The following data.table is created:

  Class Rate
 1: II  10
 2: III  9
 3: II  10
 4: II  10
 5: III 10
 6: III  9
 7: III  8
 8: II  10
 9: II  11
10: III  9
11: I    9
12: II  11
13: III 13
14: II  10
15: III 12
16: I    8
17: II   9
18: I   10
19: III  9
20: II  10

To standardize the Rate column by the Class column in DT2 created above, add the following code to the snippet above:

DT2[,"Rate":=as.vector(scale(Rate)),by=Class]
DT2

Output

If you execute all of the snippets given above as a single program, it generates the following output:

   Class     Rate
 1: II  -0.18490007
 2: III -0.50669175
 3: II  -0.18490007
 4: II  -0.18490007
 5: III  0.07238454
 6: III -0.50669175
 7: III -1.08576803
 8: II  -0.18490007
 9: II   1.47920052
10: III -0.50669175
11: I    0.00000000
12: II   1.47920052
13: III  1.80961338
14: II  -0.18490007
15: III  1.23053710
16: I   -1.00000000
17: II  -1.84900065
18: I    1.00000000
19: III -0.50669175
20: II  -0.18490007
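The same idea can be extended to several numeric columns at once. The sketch below is one possible approach using .SD and .SDcols; the table DT3 and its Score column are hypothetical and only serve to illustrate the pattern.

```r
library(data.table)

# Hypothetical table with one grouping column and two numeric columns
DT3 <- data.table(Class = c("I", "I", "I", "II", "II", "II"),
                  Rate  = c(8, 9, 10, 9, 10, 11),
                  Score = c(50, 60, 70, 10, 30, 50))

cols <- c("Rate", "Score")  # columns to standardize

# lapply() over .SD applies the same per-group scaling to each column in cols
DT3[, (cols) := lapply(.SD, function(x) as.vector(scale(x))),
    by = Class, .SDcols = cols]
DT3
```

Note that a group containing a single row has no variability, so no meaningful z-score can be computed for it.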

Updated on: 10-Nov-2021
