How to perform the Shapiro test on all columns of an R data frame?


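The Shapiro-Wilk test checks whether a variable is normally distributed; its null hypothesis is that the variable follows a normal distribution. A small p-value is therefore evidence against normality, while a large p-value only means the test found no evidence against it. If an R data frame contains numeric columns, we may want to check the normality of every variable. This can be done by combining the apply function with shapiro.test, as the following example shows.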
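Before applying the test to a whole data frame, it can help to see how a single result is read. The sketch below is illustrative only: the vector name sample_values, the seed, and the 5% threshold are assumptions made for this example and are not part of the data used later.

```r
# Hypothetical data for illustration: 30 draws from a normal distribution.
set.seed(1)
sample_values <- rnorm(30, mean = 10, sd = 2)

# shapiro.test() returns an object of class "htest".
result <- shapiro.test(sample_values)

# The W statistic and the p-value are stored as list elements.
result$statistic
result$p.value

# Assumed significance level; choose one that suits the analysis.
alpha <- 0.05
if (result$p.value < alpha) {
  message("Reject the null hypothesis: the data do not look normal.")
} else {
  message("No evidence against normality at the chosen level.")
}
```

Note that shapiro.test only accepts between 3 and 5000 non-missing values.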

Example

Consider the following data frame:

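# Fix the seed so the random draws are reproducible
set.seed(321)
x1 <- rnorm(20, 2, 0.34)  # normal, mean 2, sd 0.34
x2 <- rpois(20, 5)        # Poisson counts
x3 <- rpois(20, 2)
x4 <- rpois(20, 5)
x5 <- rpois(20, 6)
x6 <- runif(20, 1, 5)     # uniform on [1, 5]
x7 <- rexp(20, 0.62)      # exponential, rate 0.62
x8 <- rpois(20, 10)
df <- data.frame(x1, x2, x3, x4, x5, x6, x7, x8)
df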

Output

         x1 x2 x3 x4 x5       x6         x7 x8
1  2.579667  7  0  2  4 4.712527 2.69354358  9
2  1.757907  4  0  3  3 1.519762 2.63275896  9
3  1.905485  5  2  5  4 3.087971 1.83827735  5
4  1.959319  7  0 10 14 3.564951 1.19092513 10
5  1.957853  7  3  5  5 4.576069 0.61126332 10
6  2.091182  4  0  4 10 3.316821 2.56506184  8
7  2.247126  3  4  5  7 1.636518 1.88751338  9
8  2.079266  8  4  7  7 3.018356 0.11237261  8
9  2.115299  3  2  7  4 4.516734 0.17862062 13
10 1.812349  3  0  6  5 3.009659 0.57255735  8
11 2.118218  5  2  6  4 1.025079 0.09536165 10
12 2.504761  4  1  3  4 1.936312 3.11482640 14
13 2.064031  1  0  5  7 2.388424 2.96859719 13
14 2.830708  2  4  9  6 3.779138 0.61244047  6
15 1.607831  6  5  7  7 2.740338 1.15703781 12
16 1.726412  6  3  5  7 4.690268 2.78394417 10
17 2.155064  3  2  8 11 4.043131 0.12627601  7
18 2.142913  3  4  8  4 1.481830 0.14825531  8
19 2.196379  4  2  3  6 1.490243 4.61761476  5
20 2.151761  6  1  5  2 1.914817 0.26060923 11

Apply the Shapiro test to all columns of df:

Example

apply(df, 2, shapiro.test)

Output

$x1
Shapiro-Wilk normality test
data: newX[, i]
W = 0.94053, p-value = 0.2453
$x2
Shapiro-Wilk normality test
data: newX[, i]
W = 0.95223, p-value = 0.4022
$x3
Shapiro-Wilk normality test
data: newX[, i]
W = 0.88855, p-value = 0.02529
$x4
Shapiro-Wilk normality test
data: newX[, i]
W = 0.96244, p-value = 0.5938
$x5
Shapiro-Wilk normality test
data: newX[, i]
W = 0.87904, p-value = 0.017
$x6
Shapiro-Wilk normality test
data: newX[, i]
W = 0.93067, p-value = 0.1591
$x7
Shapiro-Wilk normality test
data: newX[, i]
W = 0.88531, p-value = 0.02208
$x8
Shapiro-Wilk normality test
data: newX[, i]
W = 0.96271, p-value = 0.5992
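The list printed above is long to scan, and each entry is labelled newX[, i] because apply converts the data frame to a matrix before running the test. That conversion also means a non-numeric column would turn every value into text and make shapiro.test fail. As a hedged alternative sketch, sapply can run the test column by column on the numeric columns only and collect the p-values in one table. The names df_demo, numeric_cols, p_values and summary_table, the extra group column, and the 0.05 cutoff are assumptions made for this illustration.

```r
# Hypothetical data frame with one non-numeric column, for illustration only.
set.seed(321)
df_demo <- data.frame(
  x1 = rnorm(20, 2, 0.34),
  x2 = rpois(20, 5),
  x3 = rpois(20, 2),
  group = rep(c("a", "b"), 10)
)

# Keep only numeric columns so shapiro.test() never sees text.
numeric_cols <- df_demo[sapply(df_demo, is.numeric)]

# Run the test on each column and extract just the p-value.
p_values <- sapply(numeric_cols, function(col) shapiro.test(col)$p.value)

# Collect the results in a compact table; 0.05 is an assumed threshold.
summary_table <- data.frame(
  column = names(p_values),
  p_value = unname(p_values),
  reject_normality = unname(p_values) < 0.05
)
summary_table
```

Because each column is passed to shapiro.test directly, the test runs on the original numeric vectors rather than on a coerced matrix.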

Updated on: 10 February 2021
