How to replace missing values with linear interpolation in an R vector?
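Linear interpolation estimates a missing value by drawing a straight line between the nearest known values on either side and reading the estimate off that line. Each new data point therefore lies within the range of the two original values it is interpolated from. In R, if a vector contains missing values, the na.approx function from the zoo package can replace each NA with a linearly interpolated value.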
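The idea can be sketched in base R with approx(), without zoo. This is only an illustration under two assumptions: the positions are the equally spaced indices 1..n, and fill_linear is a hypothetical helper name that is not part of zoo.

```r
# Hypothetical helper: fill interior NAs by linear interpolation over the index.
fill_linear <- function(v) {
  idx <- seq_along(v)    # positions 1, 2, ..., n
  known <- !is.na(v)     # TRUE where a value was observed
  # approx() connects the known points with straight lines and evaluates
  # those lines at every position; it needs at least two known values.
  approx(x = idx[known], y = v[known], xout = idx)$y
}

v <- c(2, NA, 4, NA, NA, 10)
filled <- fill_linear(v)
print(filled)
```

For this input the single gap is filled with 3 and the double gap with 6 and 8, since each value sits on the straight line between its known neighbours.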
Example 1
Load the zoo package and create a vector that contains NA values
> library(zoo)
> x1<-sample(c(NA,2,5),10,replace=TRUE)
> x1
Output
[1] 2 2 2 5 2 2 5 NA 2 5
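Because sample() draws at random, the vector printed here will differ from run to run. A small sketch of how the draw could be made repeatable with set.seed(); the seed value 42 is an arbitrary choice for illustration and is not used in the original example.

```r
# Fix the random number generator state so the draw can be repeated.
# The seed 42 is arbitrary (an assumption made for this illustration).
set.seed(42)
first_draw <- sample(c(NA, 2, 5), 10, replace = TRUE)

# Resetting the same seed reproduces exactly the same vector.
set.seed(42)
second_draw <- sample(c(NA, 2, 5), 10, replace = TRUE)

print(identical(first_draw, second_draw))
```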
Replace the NA values using linear interpolation
Example
> na.approx(x1)
Output
[1] 2.0 2.0 2.0 5.0 2.0 2.0 5.0 3.5 2.0 5.0
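The 3.5 at position 8 can be checked by hand. The NA sits between 5 at position 7 and 2 at position 9, so the standard two-point interpolation formula applies. The variable names below are chosen only for this illustration.

```r
# Known neighbours of the missing value in x1
x0 <- 7; y0 <- 5   # position 7 holds 5
x1_pos <- 9; y1 <- 2   # position 9 holds 2
x <- 8             # position of the NA

# Point on the straight line through (x0, y0) and (x1_pos, y1), evaluated at x
y <- y0 + (y1 - y0) * (x - x0) / (x1_pos - x0)
print(y)  # 3.5
```

Since position 8 lies exactly halfway between its neighbours, the result is simply their mean.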
Example 2
> x2<-sample(c(NA,1:4),150,replace=TRUE)
> x2
Output
[1] 2 NA NA 2 1 1 NA 2 4 NA 1 2 1 4 3 3 1 3 1 4 4 2 3 1 3
[26] 1 4 2 4 2 1 2 1 3 NA 2 NA 3 1 2 3 3 3 2 4 4 3 3 4 3
[51] 1 4 3 1 4 NA NA NA 2 NA 3 4 NA 2 3 3 1 4 2 4 NA NA 4 3 2
[76] 3 NA 3 NA 4 3 2 3 NA 3 1 1 3 2 NA 1 3 3 NA 3 NA 2 NA 4 1
[101] NA 2 2 4 3 NA 4 NA 2 2 NA 3 2 NA NA 3 NA 3 1 NA 1 NA 1 NA 1
[126] 2 1 3 4 1 4 2 3 NA 3 NA NA 4 NA 2 NA 4 2 3 NA 1 2 1 3 4
Example
> na.approx(x2)
Output
[1] 2.000000 2.000000 2.000000 2.000000 1.000000 1.000000 1.500000 2.000000
[9] 4.000000 2.500000 1.000000 2.000000 1.000000 4.000000 3.000000 3.000000
[17] 1.000000 3.000000 1.000000 4.000000 4.000000 2.000000 3.000000 1.000000
[25] 3.000000 1.000000 4.000000 2.000000 4.000000 2.000000 1.000000 2.000000
[33] 1.000000 3.000000 2.500000 2.000000 2.500000 3.000000 1.000000 2.000000
[41] 3.000000 3.000000 3.000000 2.000000 4.000000 4.000000 3.000000 3.000000
[49] 4.000000 3.000000 1.000000 4.000000 3.000000 1.000000 4.000000 3.500000
[57] 3.000000 2.500000 2.000000 2.500000 3.000000 4.000000 3.000000 2.000000
[65] 3.000000 3.000000 1.000000 4.000000 2.000000 4.000000 4.000000 4.000000
[73] 4.000000 3.000000 2.000000 3.000000 3.000000 3.000000 3.500000 4.000000
[81] 3.000000 2.000000 3.000000 3.000000 3.000000 1.000000 1.000000 3.000000
[89] 2.000000 1.500000 1.000000 3.000000 3.000000 3.000000 3.000000 2.500000
[97] 2.000000 3.000000 4.000000 1.000000 1.500000 2.000000 2.000000 4.000000
[105] 3.000000 3.500000 4.000000 3.000000 2.000000 2.000000 2.500000 3.000000
[113] 2.000000 2.333333 2.666667 3.000000 3.000000 3.000000 1.000000 1.000000
[121] 1.000000 1.000000 1.000000 1.000000 1.000000 2.000000 1.000000 3.000000
[129] 4.000000 1.000000 4.000000 2.000000 3.000000 3.000000 3.000000 3.333333
[137] 3.666667 4.000000 3.000000 2.000000 3.000000 4.000000 2.000000 3.000000
[145] 2.000000 1.000000 2.000000 1.000000 3.000000 4.000000
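When several NA values occur in a row, the interpolated values are spaced evenly between the two surrounding observations. In x2, positions 114 and 115 are both NA, with 2 at position 113 and 3 at position 116, and the output shows 2.333333 and 2.666667. A quick base R check of that spacing, with variable names chosen only for this illustration:

```r
# Observed values on either side of the two-element gap (positions 113 and 116)
left <- 2
right <- 3
n_missing <- 2

# n_missing + 2 evenly spaced points from left to right; dropping the two
# endpoints leaves the interpolated values for the gap.
path <- seq(left, right, length.out = n_missing + 2)
gap_values <- path[-c(1, length(path))]
print(round(gap_values, 6))
```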
Example 3
> x3<-sample(c(NA,rnorm(5)),80,replace=TRUE)
> x3
Output
[1] -0.7419539 -0.7419539 -0.7419539 -0.7419539 NA -0.2225833
[7] -0.7240064 0.8134500 -0.2225833 -0.2225833 0.8134500 -0.7419539
[13] -0.7240064 -0.7419539 -0.7240064 -0.7419539 -0.7240064 0.7383318
[19] NA -0.7240064 0.7383318 0.7383318 NA 0.8134500
[25] -0.2225833 -0.7419539 -0.2225833 0.8134500 0.8134500 NA
[31] -0.2225833 -0.2225833 -0.7240064 -0.2225833 0.7383318 NA
[37] NA -0.7419539 -0.7240064 -0.7240064 -0.7419539 0.7383318
[43] 0.8134500 -0.7240064 0.7383318 0.8134500 0.7383318 0.8134500
[49] 0.7383318 -0.7240064 -0.2225833 -0.7240064 -0.7240064 -0.7240064
[55] 0.7383318 0.7383318 NA -0.2225833 -0.7419539 -0.7419539
[61] 0.8134500 -0.2225833 -0.2225833 0.7383318 -0.2225833 0.8134500
[67] -0.2225833 0.7383318 -0.7240064 0.7383318 NA -0.2225833
[73] 0.7383318 -0.7419539 0.8134500 -0.2225833 NA -0.7240064
[79] -0.2225833 -0.2225833
Example
> na.approx(x3)
Output
[1] -0.741953856 -0.741953856 -0.741953856 -0.741953856 -0.482268589
[6] -0.222583323 -0.724006386 0.813450002 -0.222583323 -0.222583323
[11] 0.813450002 -0.741953856 -0.724006386 -0.741953856 -0.724006386
[16] -0.741953856 -0.724006386 0.738331799 0.007162706 -0.724006386
[21] 0.738331799 0.738331799 0.775890900 0.813450002 -0.222583323
[26] -0.741953856 -0.222583323 0.813450002 0.813450002 0.295433340
[31] -0.222583323 -0.222583323 -0.724006386 -0.222583323 0.738331799
[36] 0.244903247 -0.248525304 -0.741953856 -0.724006386 -0.724006386
[41] -0.741953856 0.738331799 0.813450002 -0.724006386 0.738331799
[46] 0.813450002 0.738331799 0.813450002 0.738331799 -0.724006386
[51] -0.222583323 -0.724006386 -0.724006386 -0.724006386 0.738331799
[56] 0.738331799 0.257874238 -0.222583323 -0.741953856 -0.741953856
[61] 0.813450002 -0.222583323 -0.222583323 0.738331799 -0.222583323
[66] 0.813450002 -0.222583323 0.738331799 -0.724006386 0.738331799
[71] 0.257874238 -0.222583323 0.738331799 -0.741953856 0.813450002
[76] -0.222583323 -0.473294855 -0.724006386 -0.222583323 -0.222583323
Example 4
> x4<-sample(c(NA,rpois(20,2)),100,replace=TRUE)
> x4
Output
[1] 3 3 0 2 NA 2 2 2 1 NA 0 1 3 3 3 3 1 1 3 3 1 2 1 1 2
[26] 3 5 5 0 2 1 1 3 2 1 3 2 NA 3 3 0 0 3 3 6 2 3 3 2 3
[51] 3 2 0 NA 2 NA 3 5 NA 0 3 1 5 2 1 NA 3 3 3 2 2 6 5 2 1
[76] 2 1 5 2 3 NA 0 0 2 2 2 0 5 2 3 6 0 3 3 3 3 2 2 3 1
Example
> na.approx(x4)
Output
[1] 3.0 3.0 0.0 2.0 2.0 2.0 2.0 2.0 1.0 0.5 0.0 1.0 3.0 3.0 3.0 3.0 1.0 1.0
[19] 3.0 3.0 1.0 2.0 1.0 1.0 2.0 3.0 5.0 5.0 0.0 2.0 1.0 1.0 3.0 2.0 1.0 3.0
[37] 2.0 2.5 3.0 3.0 0.0 0.0 3.0 3.0 6.0 2.0 3.0 3.0 2.0 3.0 3.0 2.0 0.0 1.0
[55] 2.0 2.5 3.0 5.0 2.5 0.0 3.0 1.0 5.0 2.0 1.0 2.0 3.0 3.0 3.0 2.0 2.0 6.0
[73] 5.0 2.0 1.0 2.0 1.0 5.0 2.0 3.0 1.5 0.0 0.0 2.0 2.0 2.0 0.0 5.0 2.0 3.0
[91] 6.0 0.0 3.0 3.0 3.0 3.0 2.0 2.0 3.0 1.0
Example 5
> x5<-sample(c(NA,rpois(5,3)),100,replace=TRUE)
> x5
Output
[1] 3 1 3 6 5 3 5 NA 5 5 3 1 3 1 3 NA 3 5 6 NA 3 3 5 5 3
[26] 5 NA 3 3 3 5 5 NA 5 6 3 1 3 1 3 3 5 NA 5 6 1 3 6 5 5
[51] 1 5 NA 5 NA 1 5 3 1 6 NA 5 1 5 NA NA 6 6 5 1 5 5 NA 3 5
[76] 5 5 5 1 5 NA NA 1 6 5 5 5 5 5 1 5 NA 1 NA 3 NA 3 6 5 1
Example
> na.approx(x5)
Output
[1] 3.000000 1.000000 3.000000 6.000000 5.000000 3.000000 5.000000 5.000000
[9] 5.000000 5.000000 3.000000 1.000000 3.000000 1.000000 3.000000 3.000000
[17] 3.000000 5.000000 6.000000 4.500000 3.000000 3.000000 5.000000 5.000000
[25] 3.000000 5.000000 4.000000 3.000000 3.000000 3.000000 5.000000 5.000000
[33] 5.000000 5.000000 6.000000 3.000000 1.000000 3.000000 1.000000 3.000000
[41] 3.000000 5.000000 5.000000 5.000000 6.000000 1.000000 3.000000 6.000000
[49] 5.000000 5.000000 1.000000 5.000000 5.000000 5.000000 3.000000 1.000000
[57] 5.000000 3.000000 1.000000 6.000000 5.500000 5.000000 1.000000 5.000000
[65] 5.333333 5.666667 6.000000 6.000000 5.000000 1.000000 5.000000 5.000000
[73] 4.000000 3.000000 5.000000 5.000000 5.000000 5.000000 1.000000 5.000000
[81] 3.666667 2.333333 1.000000 6.000000 5.000000 5.000000 5.000000 5.000000
[89] 5.000000 1.000000 5.000000 3.000000 1.000000 2.000000 3.000000 3.000000
[97] 3.000000 6.000000 5.000000 1.000000
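None of the sampled vectors in these examples happens to start or end with NA. When a vector does, there is no observed value on one side to interpolate from, and na.approx by default drops those edge positions, so the result is shorter than the input. A minimal sketch on a small hand-made vector (not one of the vectors used in the examples), using the na.rm argument of na.approx and the rule argument that it passes on to approx():

```r
library(zoo)

v <- c(NA, 1, NA, 3, NA)

# Default (na.rm = TRUE): leading and trailing NAs are dropped,
# so only the three middle positions remain.
trimmed <- na.approx(v)

# na.rm = FALSE keeps the original length; the edge NAs stay NA.
kept <- na.approx(v, na.rm = FALSE)

# rule = 2 is passed on to approx() and repeats the nearest observed
# value at the edges instead of leaving them missing.
extended <- na.approx(v, rule = 2)

print(trimmed)
print(kept)
print(extended)
```

Which option fits depends on the data: repeating the nearest value is an extrapolation rather than an interpolation, so keeping the edge NAs may be the safer choice when the length must be preserved.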