How to convert strings in an R data frame to unique integers?


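To convert the strings in an R data frame to unique integers, first extract the unique strings that occur in the data frame. Then, inside the data.frame function, turn each column into a factor whose levels are those unique strings and pass it to as.numeric, which returns the position of each string's level as its integer code.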
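The mechanism can be seen on a single vector before applying it to a data frame. The following is a minimal sketch; the vector temps and its values are illustrative and not part of the examples in this article.

```r
# A small fixed vector so the result is reproducible (illustrative values)
temps <- c("Hot", "Cold", "Hot", "Normal")

# Unique strings, in order of first appearance
lv <- unique(temps)

# factor() ties each string to the position of its level;
# as.numeric() then returns those positions as numbers
codes <- as.numeric(factor(temps, levels = lv))
codes
# [1] 1 2 1 3
```

Because the levels are passed explicitly, the codes follow the order of first appearance rather than alphabetical order, which is what factor would use by default.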

The examples below show how this works.

Example 1

Consider the following data frame:

x1<-sample(c("Hot","Normal","Cold"),20,replace=TRUE)
x2<-sample(c("Hot","Normal","Cold"),20,replace=TRUE)
x3<-sample(c("Hot","Normal","Cold"),20,replace=TRUE)
df1<-data.frame(x1,x2,x3)
df1

The following data frame is created (sample draws at random, so your values will differ):

       x1      x2      x3
1     Hot  Normal     Hot
2     Hot     Hot  Normal
3     Hot  Normal     Hot
4     Hot    Cold  Normal
5     Hot    Cold     Hot
6     Hot     Hot    Cold
7     Hot    Cold    Cold
8  Normal    Cold    Cold
9     Hot  Normal    Cold
10    Hot     Hot     Hot
11 Normal  Normal     Hot
12 Normal  Normal  Normal
13    Hot     Hot    Cold
14 Normal    Cold    Cold
15    Hot     Hot     Hot
16   Cold     Hot  Normal
17    Hot     Hot     Hot
18    Hot     Hot    Cold
19    Hot    Cold    Cold
20   Cold     Hot     Hot

To extract the unique values in the data frame df1 created above, add the following code to the snippet above:

x1<-sample(c("Hot","Normal","Cold"),20,replace=TRUE)
x2<-sample(c("Hot","Normal","Cold"),20,replace=TRUE)
x3<-sample(c("Hot","Normal","Cold"),20,replace=TRUE)
df1<-data.frame(x1,x2,x3)
Unique_df1<-unique(c(as.character(df1$x1),as.character(df1$x2),as.character(df1$x3)))
Unique_df1

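Output

If you execute all the snippets given above as a single program, it generates output like the following (the order follows the first appearance of each string in the sampled data):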

[1] "Hot" "Normal" "Cold"
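The position of each string in this vector is the integer it will receive. As a rough sketch of how to inspect that mapping, the snippet below builds a named lookup vector; the data frame demo and the names uniq and lookup are hypothetical, chosen only so the result is reproducible.

```r
# Small fixed data frame (hypothetical, not the sampled df1 from the article)
demo <- data.frame(
  a = c("Hot", "Normal", "Hot"),
  b = c("Cold", "Hot", "Normal"),
  stringsAsFactors = FALSE
)

# Flatten every column into one character vector and keep the unique values
uniq <- unique(unlist(lapply(demo, as.character), use.names = FALSE))

# Named integer vector: the name is the string, the value is its code
lookup <- setNames(seq_along(uniq), uniq)
lookup
#    Hot Normal   Cold 
#      1      2      3 
```

Using unlist with lapply avoids naming each column by hand, which may be convenient when the data frame has many columns.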

To convert the string values in df1 to unique numeric values, add the following code to the snippet above:

x1<-sample(c("Hot","Normal","Cold"),20,replace=TRUE)
x2<-sample(c("Hot","Normal","Cold"),20,replace=TRUE)
x3<-sample(c("Hot","Normal","Cold"),20,replace=TRUE)
df1<-data.frame(x1,x2,x3)
Unique_df1<-unique(c(as.character(df1$x1),as.character(df1$x2),as.character(df1$x3)))
df1<-data.frame(x1=as.numeric(factor(df1$x1,levels=Unique_df1)),
x2=as.numeric(factor(df1$x2,levels=Unique_df1)),
x3=as.numeric(factor(df1$x3,levels=Unique_df1)))
df1

Output

If you execute all the snippets given above as a single program, it generates output like the following:

  x1 x2 x3
1  1 2 1
2  1 1 2
3  1 2 1
4  1 3 2
5  1 3 1
6  1 1 3
7  1 3 3
8  2 3 3
9  1 2 3
10 1 1 1
11 2 2 1
12 2 2 2
13 1 1 3
14 2 3 3
15 1 1 1
16 3 1 2
17 1 1 1
18 1 1 3
19 1 3 3
20 3 1 1
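As an alternative to writing out each column, the same encoding can be sketched with lapply and match, where match returns the position of each string in the vector of unique values. This is a swapped-in technique, not the method used above; set.seed and the name df1_codes are assumptions added so the run is repeatable and the original data frame is left untouched.

```r
set.seed(42)  # assumption: fixed seed so reruns give the same data
x1 <- sample(c("Hot", "Normal", "Cold"), 20, replace = TRUE)
x2 <- sample(c("Hot", "Normal", "Cold"), 20, replace = TRUE)
x3 <- sample(c("Hot", "Normal", "Cold"), 20, replace = TRUE)
df1 <- data.frame(x1, x2, x3, stringsAsFactors = FALSE)

# Unique strings across all columns, in order of first appearance
Unique_df1 <- unique(unlist(lapply(df1, as.character), use.names = FALSE))

# Encode every column at once; df1_codes[] keeps the data frame structure
df1_codes <- df1
df1_codes[] <- lapply(df1, function(col) match(as.character(col), Unique_df1))
str(df1_codes)

# Indexing the unique vector by the codes recovers the original strings
identical(Unique_df1[df1_codes$x1], df1$x1)
```

Note that match returns integer codes, whereas as.numeric returns doubles; the values are the same.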

Example 2

The following snippet creates a sample data frame:

y1<-sample(c("Summer","Rainy","Winter","Spring"),20,replace=TRUE)
y2<-sample(c("Summer","Rainy","Winter","Spring"),20,replace=TRUE)
y3<-sample(c("Summer","Rainy","Winter","Spring"),20,replace=TRUE)
df2<-data.frame(y1,y2,y3)
df2

The following data frame is created (sample draws at random, so your values will differ):

       y1     y2     y3
1   Rainy Winter  Rainy
2  Summer  Rainy Summer
3  Summer Spring Summer
4  Summer Spring Winter
5  Winter Winter  Rainy
6  Summer Rainy  Winter
7  Winter Winter  Rainy
8  Winter Summer Spring
9  Spring Summer Winter
10 Summer Summer Spring
11  Rainy  Rainy Spring
12  Rainy Winter Summer
13 Summer Spring Spring
14 Summer Summer Winter
15 Spring Spring Winter
16 Spring Spring Spring
17 Winter Spring Spring
18 Winter  Rainy Summer
19 Winter Spring Winter
20 Winter Summer Summer

To extract the unique values in the data frame df2 created above, add the following code to the snippet above:

y1<-sample(c("Summer","Rainy","Winter","Spring"),20,replace=TRUE)
y2<-sample(c("Summer","Rainy","Winter","Spring"),20,replace=TRUE)
y3<-sample(c("Summer","Rainy","Winter","Spring"),20,replace=TRUE)
df2<-data.frame(y1,y2,y3)
Unique_df2<-unique(c(as.character(df2$y1),as.character(df2$y2),as.character(df2$y3)))
Unique_df2

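Output

If you execute all the snippets given above as a single program, it generates output like the following (the order follows the first appearance of each string in the sampled data):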

[1] "Rainy" "Summer" "Winter" "Spring"

To convert the string values in df2 to unique numeric values, add the following code to the snippet above:

y1<-sample(c("Summer","Rainy","Winter","Spring"),20,replace=TRUE)
y2<-sample(c("Summer","Rainy","Winter","Spring"),20,replace=TRUE)
y3<-sample(c("Summer","Rainy","Winter","Spring"),20,replace=TRUE)
df2<-data.frame(y1,y2,y3)
Unique_df2<-unique(c(as.character(df2$y1),as.character(df2$y2),as.character(df2$y3)))
df2<-data.frame(y1=as.numeric(factor(df2$y1,levels=Unique_df2)),
y2=as.numeric(factor(df2$y2,levels=Unique_df2)),
y3=as.numeric(factor(df2$y3,levels=Unique_df2)))
df2

Output

If you execute all the snippets given above as a single program, it generates output like the following:

  y1 y2 y3
1  1 3 1
2  2 1 2
3  2 4 2
4  2 4 3
5  3 3 1
6  2 1 3
7  3 3 1
8  3 2 4
9  4 2 3
10 2 2 4
11 1 1 4
12 1 3 2
13 2 4 4
14 2 2 3
15 4 4 3
16 4 4 4
17 3 4 4
18 3 1 2
19 3 4 3
20 3 2 2
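Both examples follow the same two steps, so they could be wrapped in a small function. The sketch below is one possible way to do that; strings_to_codes is a hypothetical helper name, and the data frame seasons holds fixed illustrative values rather than the sampled df2.

```r
# Hypothetical helper (not from the original article): encodes every column
# of a data frame using one shared set of integer codes
strings_to_codes <- function(df) {
  # Unique strings across all columns, in order of first appearance
  lv <- unique(unlist(lapply(df, as.character), use.names = FALSE))
  coded <- df
  coded[] <- lapply(df, function(col) as.numeric(factor(col, levels = lv)))
  # Return the levels too, so the codes can be mapped back later
  list(data = coded, levels = lv)
}

seasons <- data.frame(
  y1 = c("Rainy", "Summer", "Winter"),
  y2 = c("Winter", "Rainy", "Spring"),
  stringsAsFactors = FALSE
)

res <- strings_to_codes(seasons)
res$data
#   y1 y2
# 1  1  3
# 2  2  1
# 3  3  4

# Map the codes of one column back to the original strings
res$levels[res$data$y1]
# [1] "Rainy"  "Summer" "Winter"
```

Returning the levels alongside the encoded data keeps the mapping available, since the integers alone do not record which string they came from.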

Updated on: 2021-10-28
