How to replace outliers with the 5th and 95th percentile values in R?


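There are several ways to define outliers, and researchers and practitioners can also set the thresholds manually. One common choice is to treat every value below the 5th percentile or above the 95th percentile as an outlier and to replace it with the corresponding percentile value. For this, we can use the squish function of the scales package, which clamps values that fall outside a given range to the nearest limit of that range, as shown in the examples below.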

Example 1

library(scales)
x1 <- 1:10
# Clamp values outside the 5th-95th percentile range to those percentiles
x1 <- squish(x1, quantile(x1, c(0.05, 0.95)))
x1

Output

[1] 1.45 2.00 3.00 4.00 5.00 6.00 7.00 8.00 9.00 9.55
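
The limits 1.45 and 9.55 in the output above come from quantile(), which by default (type 7) interpolates linearly between neighbouring sorted values. As a minimal sketch, assuming only base R, the same clamping can be reproduced with pmax() and pmin(). The names x, bounds and clamped are illustrative and do not appear in the original example.

```r
x <- 1:10

# Default quantile() (type 7): position h = (n - 1) * p + 1, then linear interpolation.
# For p = 0.05: h = 1.45, so the value is 1 + 0.45 * (2 - 1) = 1.45.
# For p = 0.95: h = 9.55, so the value is 9 + 0.55 * (10 - 9) = 9.55.
bounds <- quantile(x, c(0.05, 0.95))

# Clamp: raise values below the lower bound, then lower values above the upper bound
clamped <- pmin(pmax(x, bounds[[1]]), bounds[[2]])
clamped
```

This is only meant to show what the squish call does to the data; it is not a claim about how scales implements it internally.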

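Example 2

# 78 standard normal values plus two artificial outliers (-5 and 5);
# no seed is set, so the values will differ from run to run
x2 <- c(-5, rnorm(78), 5)
x2

Output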

[1] -5.00000000 -0.39993096 -0.11249038 1.06589235 1.17195813 0.15677178
[7] -0.08325310 0.57986817 -0.05529031 0.13352083 1.00608625 -0.86860404
[13] 0.53672576 -0.15262216 -0.81247587 -0.31263625 -1.51127713 -1.59689010
[19] -0.11242962 -1.08234352 -0.04935398 -0.65185804 -1.10369370 0.68732306
[25] 1.83448401 1.08689945 -1.20674408 -1.25753553 0.03354570 0.67981025
[31] 0.24871123 -1.49969111 1.19287825 1.04406030 -1.31756416 0.10204579
[37] 1.48272096 0.97661717 0.50006441 -1.36247153 0.99895292 -0.49534106
[43] -0.24105508 0.35006991 -2.16041158 -1.12644863 2.23190981 -0.51413222
[49] 0.03760280 -1.12237961 -1.54094088 -0.37365780 0.02138277 1.97702046
[55] 0.37190626 -0.59456892 -0.06652980 -1.04453387 -0.50884324 0.85025142
[61] -0.66718350 -0.69703588 0.44922344 0.64238500 -1.11403189 0.66251032
[67] 0.79601219 -0.74801795 -0.10957126 -0.90781918 -2.13721781 1.43186180
[73] -0.32571115 -0.97929747 1.10822193 0.94719910 0.58934102 -1.29942407
[79] 3.83469537 5.00000000

Example

# Replace everything outside the 5th-95th percentile range of x2
x2 <- squish(x2, quantile(x2, c(0.05, 0.95)))
x2

Output

[1] -1.54373835 -0.39993096 -0.11249038 1.06589235 1.17195813 0.15677178
[7] -0.08325310 0.57986817 -0.05529031 0.13352083 1.00608625 -0.86860404
[13] 0.53672576 -0.15262216 -0.81247587 -0.31263625 -1.51127713 -1.54373835
[19] -0.11242962 -1.08234352 -0.04935398 -0.65185804 -1.10369370 0.68732306
[25] 1.83448401 1.08689945 -1.20674408 -1.25753553 0.03354570 0.67981025
[31] 0.24871123 -1.49969111 1.19287825 1.04406030 -1.31756416 0.10204579
[37] 1.48272096 0.97661717 0.50006441 -1.36247153 0.99895292 -0.49534106
[43] -0.24105508 0.35006991 -1.54373835 -1.12644863 1.84161083 -0.51413222
[49] 0.03760280 -1.12237961 -1.54094088 -0.37365780 0.02138277 1.84161083
[55] 0.37190626 -0.59456892 -0.06652980 -1.04453387 -0.50884324 0.85025142
[61] -0.66718350 -0.69703588 0.44922344 0.64238500 -1.11403189 0.66251032
[67] 0.79601219 -0.74801795 -0.10957126 -0.90781918 -1.54373835 1.43186180
[73] -0.32571115 -0.97929747 1.10822193 0.94719910 0.58934102 -1.29942407
[79] 1.84161083 1.84161083

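Example 3

# 198 Poisson(5) counts plus two artificial outliers (-50 and 50);
# no seed is set, so the values will differ from run to run
x3 <- c(-50, rpois(198, 5), 50)
x3

Output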

[1] -50 5 4 8 6 2 1 6 3 5 7 7 8 5 8 8 5 8
[19] 3 2 3 0 5 6 2 6 6 2 7 5 9 4 5 3 9 7
[37] 4 3 6 5 2 4 9 5 7 1 2 4 2 3 5 5 6 1
[55] 5 7 1 9 6 3 5 4 3 9 5 4 6 8 4 4 6 4
[73] 5 2 4 5 5 7 8 6 3 5 8 5 8 5 2 5 2 8
[91] 6 6 5 7 2 2 5 5 4 3 5 3 7 2 4 6 8 6
[109] 3 4 9 2 2 2 4 4 6 6 5 5 3 5 3 6 6 4
[127] 6 4 4 5 9 6 2 1 3 8 5 7 5 6 6 5 7 2
[145] 8 8 6 5 3 4 5 10 6 6 3 6 2 7 7 5 8 7
[163] 7 3 4 8 4 4 6 8 3 6 4 10 4 3 5 4 4 5
[181] 4 5 4 5 4 5 6 8 2 5 12 12 3 6 5 4 4 5
[199] 5 50

Example

# Replace everything outside the 5th-95th percentile range of x3
x3 <- squish(x3, quantile(x3, c(0.05, 0.95)))
x3

Output

[1] 2 5 4 8 6 2 2 6 3 5 7 7 8 5 8 8 5 8 3 2 3 2 5 6 2 6 6 2 7 5 9 4 5 3 9 7 4
[38] 3 6 5 2 4 9 5 7 2 2 4 2 3 5 5 6 2 5 7 2 9 6 3 5 4 3 9 5 4 6 8 4 4 6 4 5 2
[75] 4 5 5 7 8 6 3 5 8 5 8 5 2 5 2 8 6 6 5 7 2 2 5 5 4 3 5 3 7 2 4 6 8 6 3 4 9
[112] 2 2 2 4 4 6 6 5 5 3 5 3 6 6 4 6 4 4 5 9 6 2 2 3 8 5 7 5 6 6 5 7 2 8 8 6 5
[149] 3 4 5 9 6 6 3 6 2 7 7 5 8 7 7 3 4 8 4 4 6 8 3 6 4 9 4 3 5 4 4 5 4 5 4 5 4
[186] 5 6 8 2 5 9 9 3 6 5 4 4 5 5 9

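Example 4

# 48 exponential values (rate 3.1) plus two artificial outliers (-50 and 50);
# no seed is set, so the values will differ from run to run
x4 <- c(-50, rexp(48, 3.1), 50)
x4

Output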

[1] -50.00000000 0.46067329 0.15298747 0.22637363 0.23424447
[6] 0.15467335 0.37455989 0.07762013 0.33175821 0.09303333
[11] 0.03806199 0.20649621 0.22883480 0.49089164 0.82497712
[16] 0.04780089 0.05156566 0.35638257 0.37319578 0.71100713
[21] 0.08649528 0.31543159 0.02263685 0.00963146 0.44814049
[26] 0.34506738 0.29533295 0.13803055 0.05497129 0.03901786
[31] 0.01818446 0.78122217 0.04863415 0.33353520 0.39530353
[36] 0.05385106 0.19991695 0.16913554 0.01549729 0.15901185
[41] 0.65120205 0.36483214 0.18226180 0.20708671 0.01590697
[46] 1.01257680 0.42223292 0.17291614 0.15793390 50.00000000

Example

# Replace everything outside the 5th-95th percentile range of x4
x4 <- squish(x4, quantile(x4, c(0.05, 0.95)))
x4

Output

[1] 0.01568165 0.46067329 0.15298747 0.22637363 0.23424447 0.15467335
[7] 0.37455989 0.07762013 0.33175821 0.09303333 0.03806199 0.20649621
[13] 0.22883480 0.49089164 0.80528739 0.04780089 0.05156566 0.35638257
[19] 0.37319578 0.71100713 0.08649528 0.31543159 0.02263685 0.01568165
[25] 0.44814049 0.34506738 0.29533295 0.13803055 0.05497129 0.03901786
[31] 0.01818446 0.78122217 0.04863415 0.33353520 0.39530353 0.05385106
[37] 0.19991695 0.16913554 0.01568165 0.15901185 0.65120205 0.36483214
[43] 0.18226180 0.20708671 0.01590697 0.80528739 0.42223292 0.17291614
[49] 0.15793390 0.80528739
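
All four examples repeat the same two steps: compute the 5th and 95th percentiles, then clamp the vector to them. As a rough sketch, those steps could be wrapped in a small base R helper. The function name cap_outliers, its probs argument, the seed and the trailing NA in the test vector are all assumptions made for illustration; none of them come from the scales package or from the examples above.

```r
# Hypothetical helper (not part of scales): cap a numeric vector at two percentiles
cap_outliers <- function(x, probs = c(0.05, 0.95)) {
  # na.rm = TRUE so that missing values do not make quantile() fail
  bounds <- quantile(x, probs, na.rm = TRUE)
  # NA entries stay NA because pmin()/pmax() propagate them by default
  pmin(pmax(x, bounds[[1]]), bounds[[2]])
}

set.seed(1)  # arbitrary seed so the sketch is repeatable
x <- c(-50, rexp(48, 3.1), 50, NA)
capped <- cap_outliers(x)
range(capped, na.rm = TRUE)  # the artificial extremes -50 and 50 are gone
```

Passing na.rm = TRUE is a design choice of this sketch: quantile() stops with an error when the input contains missing values and na.rm is left at its default, so the plain calls in the examples above only work on complete data.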

Updated on: 08-Feb-2021
