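How do you convert a numeric column to a factor column in R?
Values that represent factor levels are often recorded as numbers, so those columns need to be converted to factors before analysis. Otherwise R treats the column as numeric (for example, as a continuous predictor in a model) and the results of the analysis will be misleading. The as.factor() function performs the conversion, as the examples below show.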
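To make the point about incorrect results concrete, here is a minimal sketch that fits the same linear model twice on a copy of mtcars. The names cars, fit_numeric, and fit_factor are illustrative and not part of the original article.

```r
# Work on a copy so the built-in data set stays untouched.
cars <- mtcars

# cyl as a number: R estimates a single slope for cyl.
fit_numeric <- lm(mpg ~ cyl, data = cars)

# cyl as a factor: R estimates one coefficient per non-reference level.
cars$cyl <- as.factor(cars$cyl)
fit_factor <- lm(mpg ~ cyl, data = cars)

names(coef(fit_numeric))  # "(Intercept)" "cyl"
names(coef(fit_factor))   # "(Intercept)" "cyl6" "cyl8"
```

With the numeric column the model assumes mpg changes by the same amount for every extra cylinder; with the factor column each cylinder count gets its own effect relative to the 4-cylinder baseline.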
Example
data(mtcars)
str(mtcars)
Output
'data.frame': 32 obs. of 11 variables:
 $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
 $ disp: num 160 160 108 258 360 ...
 $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num 16.5 17 18.6 19.4 17 ...
 $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
 $ am : num 1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
Example
mtcars$cyl <- as.factor(mtcars$cyl)
str(mtcars)
Output
'data.frame': 32 obs. of 11 variables:
 $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ...
 $ disp: num 160 160 108 258 360 ...
 $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num 16.5 17 18.6 19.4 17 ...
 $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
 $ am : num 1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
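as.factor() uses the numeric values themselves as level names. When readable labels are preferable, base R's factor() accepts levels and labels arguments. The sketch below is one possible approach, applied to the am column on a copy named cars (an illustrative name); in mtcars, am is coded 0 for automatic and 1 for manual transmission.

```r
# Work on a copy so the built-in data set stays untouched.
cars <- mtcars

# Map the numeric codes 0/1 to descriptive level names.
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("automatic", "manual"))

levels(cars$am)  # "automatic" "manual"
table(cars$am)   # counts of cars per transmission type
```

Listing the levels explicitly also fixes their order, which determines the reference level used in models.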
Example
data(ToothGrowth)
str(ToothGrowth)
Output
'data.frame': 60 obs. of 3 variables:
 $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
 $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
 $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
Example
head(ToothGrowth, 20)
Output
    len supp dose
1   4.2   VC  0.5
2  11.5   VC  0.5
3   7.3   VC  0.5
4   5.8   VC  0.5
5   6.4   VC  0.5
6  10.0   VC  0.5
7  11.2   VC  0.5
8  11.2   VC  0.5
9   5.2   VC  0.5
10  7.0   VC  0.5
11 16.5   VC  1.0
12 16.5   VC  1.0
13 15.2   VC  1.0
14 17.3   VC  1.0
15 22.5   VC  1.0
16 17.3   VC  1.0
17 13.6   VC  1.0
18 14.5   VC  1.0
19 18.8   VC  1.0
20 15.5   VC  1.0
Example
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
str(ToothGrowth)
Output
'data.frame': 60 obs. of 3 variables:
 $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
 $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
 $ dose: Factor w/ 3 levels "0.5","1","2": 1 1 1 1 1 1 1 1 1 1 ...
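Note that after the conversion the dose column stores the level codes 1, 2, 3 internally, while the original values survive only as level names. If the numbers are needed again later, calling as.numeric() directly on the factor returns the codes, not the values. A small sketch of the difference, using illustrative variable names (dose_f, codes, values):

```r
# Start from the built-in numeric dose values.
dose_f <- as.factor(datasets::ToothGrowth$dose)

# as.numeric() on a factor returns the internal level codes.
codes <- as.numeric(dose_f)
unique(codes)   # 1 2 3

# Going through as.character() recovers the original numbers.
values <- as.numeric(as.character(dose_f))
unique(values)  # 0.5 1.0 2.0
```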
Example
data(morley)
str(morley)
Output
'data.frame': 100 obs. of 3 variables:
 $ Expt : int 1 1 1 1 1 1 1 1 1 1 ...
 $ Run : int 1 2 3 4 5 6 7 8 9 10 ...
 $ Speed: int 850 740 900 1070 930 850 950 980 980 880 ...
Example
morley$Expt <- as.factor(morley$Expt)
str(morley)
Output
'data.frame': 100 obs. of 3 variables:
 $ Expt : Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Run : int 1 2 3 4 5 6 7 8 9 10 ...
 $ Speed: int 850 740 900 1070 930 850 950 980 980 880 ...
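Example
x1 <- sample(1:4, 20, replace = TRUE)
x2 <- rnorm(20, 2, 3)
df <- data.frame(x1, x2)
str(df)
Output
'data.frame': 20 obs. of 2 variables:
 $ x1: int 1 4 1 2 3 1 4 1 4 2 ...
 $ x2: num 1.56 1.64 2.83 2.2 4.23 ...
Example
df$x1 <- as.factor(df$x1)
str(df)
Output
'data.frame': 20 obs. of 2 variables:
 $ x1: Factor w/ 4 levels "1","2","3","4": 4 3 2 2 4 4 1 2 4 2 ...
 $ x2: num 3.82 1.13 2.99 5.8 3.3 ...
Because sample() and rnorm() draw random values, the numbers differ from run to run; the second output above evidently comes from a different draw than the first, so the two do not match row for row.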
Example
data(BOD)
str(BOD)
Output
'data.frame': 6 obs. of 2 variables:
 $ Time : num 1 2 3 4 5 7
 $ demand: num 8.3 10.3 19 16 15.6 19.8
 - attr(*, "reference")= chr "A1.4, p. 270"
Example
BOD$Time <- as.factor(BOD$Time)
str(BOD)
Output
'data.frame': 6 obs. of 2 variables:
 $ Time : Factor w/ 6 levels "1","2","3","4",..: 1 2 3 4 5 6
 $ demand: num 8.3 10.3 19 16 15.6 19.8
 - attr(*, "reference")= chr "A1.4, p. 270"
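The examples above convert one column at a time. When several columns hold level codes, the same as.factor() call can be applied to all of them with lapply(). The following is a sketch only: the choice of columns in factor_cols is an assumption made for illustration, and which columns count as categorical depends on the analysis.

```r
# Work on a copy so the built-in data set stays untouched.
cars <- mtcars

# Hypothetical selection of columns to treat as categorical.
factor_cols <- c("cyl", "vs", "am", "gear", "carb")

# Apply as.factor() to each selected column and write the results back.
cars[factor_cols] <- lapply(cars[factor_cols], as.factor)

# Number of levels in each converted column.
sapply(cars[factor_cols], nlevels)
```

Assigning back into cars[factor_cols] keeps the remaining columns and the column order unchanged.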