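How to remove duplicate column names from a data.table object in R?
In data analysis, we sometimes run into duplicated data or columns that share the same name. One such case is a data.table object in which two columns have an identical name. To handle it, combine the which function with the duplicated function to find the positions of the repeated names, then assign NULL to those columns with := to remove them. Note that duplicated compares the names only, so the first column carrying each name is kept, whatever the columns contain.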
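The pieces of that expression can be inspected one at a time. The following is a minimal sketch, assuming the data.table package is installed; the table DT and its columns a and b are hypothetical and were chosen so that the two columns named a hold different values.

```r
library(data.table)

# Hypothetical table: the name "a" appears twice, with different contents
DT <- data.table(a = 1:3, b = 4:6, a = 7:9)

names(DT)                              # "a" "b" "a"
duplicated(names(DT))                  # FALSE FALSE  TRUE
dup_pos <- which(duplicated(names(DT)))
dup_pos                                # 3

# Assigning NULL with := deletes the flagged column by reference
DT[, which(duplicated(names(DT))) := NULL]
names(DT)                              # "a" "b"
```

Because the match is made on names alone, the second a column is dropped here even though its values differ from the first one.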
Example 1
Load the data.table package and create a data.table object:
library(data.table)
x1 <- rnorm(20)
DT1 <- data.table(x1, x1)
DT1
Output
             x1          x1
 1: -1.65034927 -1.65034927
 2: -1.95441645 -1.95441645
 3:  2.03530252  2.03530252
 4: -2.07789754 -2.07789754
 5: -1.31558491 -1.31558491
 6:  0.69256432  0.69256432
 7:  1.83924420  1.83924420
 8: -1.59751233 -1.59751233
 9: -0.12015454 -0.12015454
10:  0.46507856  0.46507856
11:  1.00867249  1.00867249
12:  1.76181383  1.76181383
13:  0.35151845  0.35151845
14: -0.29470885 -0.29470885
15: -0.01617467 -0.01617467
16:  1.28775955  1.28775955
17: -1.80266832 -1.80266832
18: -0.70682196 -0.70682196
19: -2.07815278 -2.07815278
20:  0.43574626  0.43574626
Remove one of the x1 columns from DT1:
DT1[, which(duplicated(names(DT1))) := NULL]
DT1
Output
             x1
 1: -1.65034927
 2: -1.95441645
 3:  2.03530252
 4: -2.07789754
 5: -1.31558491
 6:  0.69256432
 7:  1.83924420
 8: -1.59751233
 9: -0.12015454
10:  0.46507856
11:  1.00867249
12:  1.76181383
13:  0.35151845
14: -0.29470885
15: -0.01617467
16:  1.28775955
17: -1.80266832
18: -0.70682196
19: -2.07815278
20:  0.43574626
Example 2
y1 <- rpois(20, 5)
DT2 <- data.table(y1, y1)
DT2
Output
    y1 y1
 1:  6  6
 2:  5  5
 3:  7  7
 4:  5  5
 5:  2  2
 6:  3  3
 7:  6  6
 8:  5  5
 9:  5  5
10:  3  3
11:  3  3
12:  4  4
13:  3  3
14:  6  6
15:  4  4
16:  5  5
17:  4  4
18:  3  3
19:  1  1
20:  2  2
Remove one of the y1 columns from DT2:
DT2[, which(duplicated(names(DT2))) := NULL]
DT2
Output
    y1
 1:  6
 2:  5
 3:  7
 4:  5
 5:  2
 6:  3
 7:  6
 8:  5
 9:  5
10:  3
11:  3
12:  4
13:  3
14:  6
15:  4
16:  5
17:  4
18:  3
19:  1
20:  2
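Both examples repeat the same call, so it may be convenient to wrap it in a small helper. The sketch below is one possible way to do that, not part of data.table: the function name drop_duplicate_names is hypothetical, and the length check is an added precaution so that the deletion is skipped when no name is repeated.

```r
library(data.table)

# Hypothetical helper (not a data.table function): removes every column
# whose name has already appeared further to the left in the table.
drop_duplicate_names <- function(DT) {
  dup <- which(duplicated(names(DT)))
  # Only touch the table when at least one repeated name was found
  if (length(dup) > 0L) {
    DT[, (dup) := NULL]   # deletes by reference, so the caller's table changes
  }
  invisible(DT)
}

# Usage with hypothetical data: "a" and "b" each appear twice
DT3 <- data.table(a = 1:3, b = 4:6, a = 7:9, b = 10:12)
drop_duplicate_names(DT3)
names(DT3)   # "a" "b"
```

Since := works by reference, the helper changes the table passed to it; copy the table first with copy() if the original needs to be preserved.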