How to cut the elements of a numeric vector into multiple intervals in R?


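A numeric vector may contain many elements, so it is often convenient to convert it into a vector of intervals. For example, if a vector holds the values 1 to 10, we might group 1, 2, 3, 4 and 5 into the interval [1,5] and 6, 7, 8, 9 and 10 into the interval (5,10]. This can be done with the cut function, whose breaks argument specifies the boundaries used to group the vector elements into intervals. By default the intervals are open on the left and closed on the right, and include.lowest=TRUE also closes the first interval on the left so that a value equal to the lowest break is not returned as NA.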
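
As a minimal sketch of the idea described above, the following groups the values 1 to 10 into two intervals. The variable names `v` and `groups` are illustrative assumptions and do not come from the article.

```r
# Illustrative vector; the names `v` and `groups` are assumptions
v <- 1:10

# Two intervals: [1,5] and (5,10]; include.lowest = TRUE closes the
# first interval on the left so that the value 1 is not turned into NA
groups <- cut(v, breaks = c(1, 5, 10), include.lowest = TRUE)

# The result is a factor with one level per interval
levels(groups)
table(groups)
```

Because the result is a factor, it can be passed directly to functions such as `table` to count how many elements fall into each interval.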

Example

> x1<-rep(1:10,times=5)
> x1

Output

[1] 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5
[26] 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10
> cut(x1,breaks=c(0,10),include.lowest=TRUE)

Output

[1] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10]
[11] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10]
[21] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10]
[31] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10]
[41] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10] [0,10]
Levels: [0,10]
> cut(x1,breaks=c(0,5,10),include.lowest=TRUE)

Output

[1] [0,5] [0,5] [0,5] [0,5] [0,5] (5,10] (5,10] (5,10] (5,10] (5,10]
[11] [0,5] [0,5] [0,5] [0,5] [0,5] (5,10] (5,10] (5,10] (5,10] (5,10]
[21] [0,5] [0,5] [0,5] [0,5] [0,5] (5,10] (5,10] (5,10] (5,10] (5,10]
[31] [0,5] [0,5] [0,5] [0,5] [0,5] (5,10] (5,10] (5,10] (5,10] (5,10]
[41] [0,5] [0,5] [0,5] [0,5] [0,5] (5,10] (5,10] (5,10] (5,10] (5,10]
Levels: [0,5] (5,10]
> cut(x1,breaks=c(0,2,5,10),include.lowest=TRUE)

Output

[1] [0,2] [0,2] (2,5] (2,5] (2,5] (5,10] (5,10] (5,10] (5,10] (5,10]
[11] [0,2] [0,2] (2,5] (2,5] (2,5] (5,10] (5,10] (5,10] (5,10] (5,10]
[21] [0,2] [0,2] (2,5] (2,5] (2,5] (5,10] (5,10] (5,10] (5,10] (5,10]
[31] [0,2] [0,2] (2,5] (2,5] (2,5] (5,10] (5,10] (5,10] (5,10] (5,10]
[41] [0,2] [0,2] (2,5] (2,5] (2,5] (5,10] (5,10] (5,10] (5,10] (5,10]
Levels: [0,2] (2,5] (5,10]
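
The interval notation can be replaced with readable names through the `labels` argument of `cut`. The sketch below reuses the breaks from the last call; the label names "low", "mid" and "high" and the variable `g` are hypothetical choices made only for illustration.

```r
# Same data as x1 above
x1 <- rep(1:10, times = 5)

# One label per interval; the label names are hypothetical
g <- cut(x1,
         breaks = c(0, 2, 5, 10),
         labels = c("low", "mid", "high"),
         include.lowest = TRUE)

# Count how many elements fall into each labelled interval
table(g)
```

The number of labels has to equal the number of intervals, which is one less than the number of breaks.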

Example

> x2<-sample(1:50,100,replace=TRUE)
> x2

Output

[1] 13 7 48 13 31 27 19 40 10 42 13 6 12 18 14 21 49 20 10 21 39 39 42 11 10
[26] 39 24 9 7 2 46 11 34 40 9 16 47 23 43 39 12 37 22 6 17 42 13 10 13 34
[51] 48 48 31 3 26 27 24 1 33 41 16 17 8 34 12 12 50 39 44 46 26 29 9 43 10
[76] 47 41 25 50 18 8 50 29 8 20 46 48 9 13 21 45 7 6 38 4 37 28 9 46 43
> cut(x2,breaks=c(0,10,20,30,40,50),include.lowest=TRUE)

Output

[1] (10,20] [0,10] (40,50] (10,20] (30,40] (20,30] (10,20] (30,40] [0,10]
[10] (40,50] (10,20] [0,10] (10,20] (10,20] (10,20] (20,30] (40,50] (10,20]
[19] [0,10] (20,30] (30,40] (30,40] (40,50] (10,20] [0,10] (30,40] (20,30]
[28] [0,10] [0,10] [0,10] (40,50] (10,20] (30,40] (30,40] [0,10] (10,20]
[37] (40,50] (20,30] (40,50] (30,40] (10,20] (30,40] (20,30] [0,10] (10,20]
[46] (40,50] (10,20] [0,10] (10,20] (30,40] (40,50] (40,50] (30,40] [0,10]
[55] (20,30] (20,30] (20,30] [0,10] (30,40] (40,50] (10,20] (10,20] [0,10]
[64] (30,40] (10,20] (10,20] (40,50] (30,40] (40,50] (40,50] (20,30] (20,30]
[73] [0,10] (40,50] [0,10] (40,50] (40,50] (20,30] (40,50] (10,20] [0,10]
[82] (40,50] (20,30] [0,10] (10,20] (40,50] (40,50] [0,10] (10,20] (20,30]
[91] (40,50] [0,10] [0,10] (30,40] [0,10] (30,40] (20,30] [0,10] (40,50]
[100] (40,50]
Levels: [0,10] (10,20] (20,30] (30,40] (40,50]
> cut(x2,breaks=c(0,25,50),include.lowest=TRUE)

Output

[1] [0,25] [0,25] (25,50] [0,25] (25,50] (25,50] [0,25] (25,50] [0,25]
[10] (25,50] [0,25] [0,25] [0,25] [0,25] [0,25] [0,25] (25,50] [0,25]
[19] [0,25] [0,25] (25,50] (25,50] (25,50] [0,25] [0,25] (25,50] [0,25]
[28] [0,25] [0,25] [0,25] (25,50] [0,25] (25,50] (25,50] [0,25] [0,25]
[37] (25,50] [0,25] (25,50] (25,50] [0,25] (25,50] [0,25] [0,25] [0,25]
[46] (25,50] [0,25] [0,25] [0,25] (25,50] (25,50] (25,50] (25,50] [0,25]
[55] (25,50] (25,50] [0,25] [0,25] (25,50] (25,50] [0,25] [0,25] [0,25]
[64] (25,50] [0,25] [0,25] (25,50] (25,50] (25,50] (25,50] (25,50] (25,50]
[73] [0,25] (25,50] [0,25] (25,50] (25,50] [0,25] (25,50] [0,25] [0,25]
[82] (25,50] (25,50] [0,25] [0,25] (25,50] (25,50] [0,25] [0,25] [0,25]
[91] (25,50] [0,25] [0,25] (25,50] [0,25] (25,50] (25,50] [0,25] (25,50]
[100] (25,50]
Levels: [0,25] (25,50]
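
Instead of a vector of boundaries, `breaks` also accepts a single number, in which case `cut` divides the range of the data into that many equal-width intervals. A minimal sketch, assuming data similar to x2; the seed and the names `values` and `bins` are arbitrary choices for illustration:

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
values <- sample(1:50, 100, replace = TRUE)  # same kind of data as x2

# A single number asks cut for that many equal-width intervals
bins <- cut(values, breaks = 5)

nlevels(bins)  # number of intervals created
table(bins)    # how many values fall into each interval
```

In this form `cut` widens the range slightly so that the smallest and largest values are included, which is why include.lowest is not needed here.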

Example

> x3<-rnorm(50)
> x3

Output

[1] -1.64517642 -0.01272833 0.91663842 -0.09876889 0.55948078 -0.18988625
[7] -0.01549091 0.38267434 0.41934828 0.29722536 -0.26682682 1.96459180
[13] 0.80592720 0.21291731 1.40838981 0.76851455 -0.78332882 -1.36524134
[19] 0.41831139 0.56214779 -0.02624205 0.82156347 1.61126016 -1.90119281
[25] -0.74483003 1.23303337 -0.01098453 1.41730325 -0.32065348 0.47327116
[31] -1.15282707 0.40602932 0.50278613 0.14055559 0.12118253 0.92430501
[37] 0.96296365 -1.46332577 -1.28307015 1.79576714 -0.36234303 -0.15216707
[43] -0.35560809 -1.50113319 1.24395907 2.10404794 0.38022634 -0.22089852
[49] 0.49641952 2.08534151
> cut(x3,breaks=c(-3,-2,-1,0,1,2,3),include.lowest=TRUE)

Output

[1] (-2,-1] (-1,0] (0,1] (-1,0] (0,1] (-1,0] (-1,0] (0,1] (0,1]
[10] (0,1] (-1,0] (1,2] (0,1] (0,1] (1,2] (0,1] (-1,0] (-2,-1]
[19] (0,1] (0,1] (-1,0] (0,1] (1,2] (-2,-1] (-1,0] (1,2] (-1,0]
[28] (1,2] (-1,0] (0,1] (-2,-1] (0,1] (0,1] (0,1] (0,1] (0,1]
[37] (0,1] (-2,-1] (-2,-1] (1,2] (-1,0] (-1,0] (-1,0] (-2,-1] (1,2]
[46] (2,3] (0,1] (-1,0] (0,1] (2,3]
Levels: [-3,-2] (-2,-1] (-1,0] (0,1] (1,2] (2,3]
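
Fixed breaks such as -3 to 3 cover most draws from a standard normal distribution, but any value outside the outermost breaks is returned as NA. The sketch below shows this with a small hand-made vector (the values and the names `z`, `narrow` and `wide` are illustrative assumptions) and uses -Inf and Inf as outer breaks to catch every value.

```r
# Hand-picked values; the first and last lie outside the range -3 to 3
z <- c(-3.5, -0.2, 0.7, 4.1)

# Values outside the outermost breaks become NA
narrow <- cut(z, breaks = c(-3, -2, -1, 0, 1, 2, 3), include.lowest = TRUE)
is.na(narrow)

# Open-ended outer breaks assign every finite value to an interval
wide <- cut(z, breaks = c(-Inf, -2, -1, 0, 1, 2, Inf))
anyNA(wide)
```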

Example

> x4<-rexp(50)
> x4

Output

[1] 0.36828906 0.99410752 1.60327543 1.00733906 0.04192489 0.27703294
[7] 1.01972666 0.13404797 1.61118488 0.56973146 1.14849062 1.26478100
[13] 1.96776564 0.28164489 2.20068898 0.53786708 0.05716903 1.17060241
[19] 4.11329046 0.05836550 2.03377736 1.51988907 1.31311807 1.12480807
[25] 2.39660434 1.53262673 1.53017418 0.93182793 0.64312828 1.98225892
[31] 2.29062631 2.04986737 0.20598660 0.05072401 0.11331514 0.05711355
[37] 0.46356027 0.01115845 0.06631682 0.35291485 0.07836249 0.02739561
[43] 0.69748192 0.39024496 1.59251440 0.15721042 0.27359071 1.67332810
[49] 0.85041291 0.36395538
> cut(x4,breaks=c(0,2,4,6),include.lowest=TRUE)

Output

[1] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2]
[13] [0,2] [0,2] (2,4] [0,2] [0,2] [0,2] (4,6] [0,2] (2,4] [0,2] [0,2] [0,2]
[25] (2,4] [0,2] [0,2] [0,2] [0,2] [0,2] (2,4] (2,4] [0,2] [0,2] [0,2] [0,2]
[37] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2] [0,2]
[49] [0,2] [0,2]
Levels: [0,2] (2,4] (4,6]
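
For non-negative data such as exponential draws, intervals that are closed on the left and open on the right can be a more natural choice. This is what `right=FALSE` does. A minimal sketch with hand-picked values (the names `y` and `left_closed` are illustrative assumptions):

```r
# Hand-picked values that sit exactly on some of the breaks
y <- c(0, 1.5, 2, 3.9, 4)

# right = FALSE gives intervals of the form [a,b)
left_closed <- cut(y, breaks = c(0, 2, 4, 6), right = FALSE)

levels(left_closed)
as.character(left_closed)
```

With `right=FALSE` a value equal to a break, such as 2 or 4, is assigned to the interval that starts at that break, and 0 is included in the first interval without needing include.lowest.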

Example

> x5  # definition not shown in the source; 50 values between 2 and 5

Output

[1] 4.648796 4.536560 3.420823 4.725203 3.960189 4.016633 4.089574 2.858532
[9] 3.820272 4.162882 3.765825 3.105139 4.569274 3.305300 3.848793 2.243246
[17] 2.956504 2.123237 4.966899 3.060593 4.545236 4.799034 3.242637 3.594802
[25] 3.984293 2.778013 4.690801 4.748652 3.197996 2.931886 4.604935 3.443908
[33] 4.290655 2.898870 2.667637 3.438125 2.137097 2.596239 4.573550 4.062213
[41] 4.990185 4.437294 4.802661 4.250570 3.682694 4.631286 3.975588 3.249041
[49] 4.993362 4.235411
> cut(x5,breaks=c(1,2,3,4,5),include.lowest=TRUE)

Output

[1] (4,5] (4,5] (3,4] (4,5] (3,4] (4,5] (4,5] (2,3] (3,4] (4,5] (3,4] (3,4]
[13] (4,5] (3,4] (3,4] (2,3] (2,3] (2,3] (4,5] (3,4] (4,5] (4,5] (3,4] (3,4]
[25] (3,4] (2,3] (4,5] (4,5] (3,4] (2,3] (4,5] (3,4] (4,5] (2,3] (2,3] (3,4]
[37] (2,3] (2,3] (4,5] (4,5] (4,5] (4,5] (4,5] (4,5] (3,4] (4,5] (3,4] (3,4]
[49] (4,5] (4,5]
Levels: [1,2] (2,3] (3,4] (4,5]

Updated on: 04-Sep-2020
