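How to convert a data frame column of comma-separated values into multiple columns in R?
We often need to import data from external sources into R for analysis, and that data may store each row as a single string of comma-separated values. To turn those values into separate columns, we can use the cSplit function from the splitstackshape package. In the examples below, we create a data frame with one column of comma-separated values and then split it so that each value gets its own column.
Consider the following data frame: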
Example
# Build 20 rows, each a string of 10 comma-separated Poisson draws
df <- data.frame(x = apply(matrix(rpois(200, 10), 20, 10), 1, paste, collapse = ","))
df
Output
   x
1  8,12,7,12,10,8,11,6,8,7
2  9,13,14,12,9,10,11,15,7,8
3  6,15,12,11,15,6,9,8,12,7
4  13,9,18,10,9,7,7,7,4,9
5  8,5,14,7,8,9,11,7,7,6
6  10,9,7,11,12,14,11,9,8,7
7  12,9,11,15,13,8,10,11,17,6
8  10,9,9,10,8,12,5,7,10,13
9  10,14,8,8,11,6,4,11,12,8
10 13,10,15,13,8,7,11,8,11,12
11 9,13,17,6,6,9,12,16,13,11
12 14,12,11,7,15,8,14,14,9,8
13 12,23,10,7,8,9,7,14,11,7
14 8,10,12,9,10,10,7,9,13,12
15 7,14,9,10,10,11,10,9,11,6
16 7,11,8,8,6,9,5,11,4,7
17 13,9,14,13,9,11,7,9,4,12
18 6,7,7,10,4,9,9,16,4,9
19 13,11,6,8,9,11,7,14,11,10
20 10,7,13,10,11,7,10,11,13,10
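The one-line construction of the data frame packs several steps together. As a rough sketch of what happens inside it, the steps can be written out one at a time. The names counts and df_demo and the seed value are illustrative assumptions, not part of the original example; the seed is added only so that repeated runs produce the same data.

```r
# Fix the random seed so repeated runs give the same data
# (the value 123 is arbitrary and not part of the original example)
set.seed(123)

# 200 Poisson draws with mean 10, arranged as a 20 x 10 matrix
counts <- matrix(rpois(200, 10), nrow = 20, ncol = 10)

# For each row (MARGIN = 1), paste the 10 values into one comma-separated string
x <- apply(counts, 1, paste, collapse = ",")

# Wrap the 20 strings in a one-column data frame
df_demo <- data.frame(x = x)

# Inspect the structure: 20 observations of 1 variable
str(df_demo)
```

Without a call to set.seed, every run draws new random values, so printed results will not match from one run to the next.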
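Load the splitstackshape package and split the comma-separated values. Because the data is generated randomly and no seed is set, the output below comes from a different random draw than the data frame printed above: the values differ, but the structure is the same.
Example
library(splitstackshape)
# Split column "x" on commas; the result is a data.table with one column per value
cSplit(df, "x", ",")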
Output
    x_01 x_02 x_03 x_04 x_05 x_06 x_07 x_08 x_09 x_10
 1:    9   15   16   11    9    8    7   12    7   11
 2:   17   13    7   20    8   12   10   17    5   13
 3:    8   13   12    7    9   10    7    7    7    5
 4:   11   15   13   15   10    6    8   11    9    9
 5:    6   11    8   10   14   11    8   15   10    9
 6:    8    5   10    9   10   12    7    9    6   10
 7:    7   11   10   11   11   12   17   13   11   11
 8:    9    7    9    3   11    7   10   10   12    9
 9:    9   10   15    9   16   13   14   12    4   10
10:    7   18   10    8   11   10    7    8   17   13
11:    8   11    7   11   15   10   12    7   13    9
12:   10    5   15    8    9    7    5    7   15   11
13:   11   11   13    4   13   11    6    7    3   11
14:   12    7   14   13   12    9    7    9   11   10
15:    7   11   15   14    8    9    3   10   14    9
16:   13   13    5    9    8   12    9   13    9    5
17:   10   12   16   10    7   10    7   17   15   11
18:   14   10   13   11    8   12   12    6    9    6
19:    4    3    5    8   11   11   13    9   10    4
20:   10   10   10   12   10    4   18   12    8    6
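For comparison, the same kind of split can be sketched in base R without any extra package. This is an alternative to cSplit, not the method the article uses, and it assumes every row holds the same number of values. The names df_small, parts, and split_df are hypothetical, and the data is written by hand so the result is easy to check by eye.

```r
# Small hand-written data frame (illustrative values only)
df_small <- data.frame(x = c("8,12,7", "9,13,14", "6,15,12"))

# Split each string on commas; strsplit() returns a list of character vectors
parts <- strsplit(as.character(df_small$x), ",", fixed = TRUE)

# Stack the pieces row-wise into a character matrix, then make a data frame
# (this step assumes all rows have the same number of values)
split_df <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)

# Convert every column from character to numeric
split_df[] <- lapply(split_df, as.numeric)

# Name the columns x_1, x_2, ... in a style similar to the cSplit output
names(split_df) <- paste0("x_", seq_along(split_df))

print(split_df)
```

Unlike cSplit, which returns a data.table, this sketch returns a plain data frame, and the type conversion has to be done by hand.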
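Let's look at another example, this time with decimal values:
Example
# Build 20 rows, each a string of 3 comma-separated exponential draws (rate 3.5);
# 60 values fill the 20 x 3 matrix exactly
Rate <- data.frame(x = apply(matrix(rexp(60, 3.5), 20, 3), 1, paste, collapse = ","))
Rate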
Output
   x
1  0.0159381624045117,0.0144605446992187,0.12841108095433
2  0.54651167305141,0.167623156548611,0.242777586503458
3  0.178547318625663,0.291459725891611,0.00192235474549493
4  0.0913145632616111,0.964466593047224,0.101487736882908
5  0.0518861563344087,0.0108454372741601,0.0576572126598852
6  0.0378260237297841,0.736165504866593,0.314890729401991
7  0.0135630246784006,0.0276414981900331,0.0129506557381579
8  0.066366289742291,0.234642320167511,0.845350959423283
9  0.192795767315796,0.203176257587675,0.339430415874007
10 0.143820086227996,0.823251408256266,0.839730198087015
11 0.154881650581956,0.035932720736067,0.384848597314446
12 0.141309956620846,0.0586299393326044,0.191098268676017
13 0.224339416612165,0.166978088340589,0.244316381757197
14 0.00643399850066219,0.225754089641489,0.0545826112585408
15 0.159973739912467,0.0245804649550754,0.0734476327725455
16 0.047296607414733,0.68713491991819,0.230530212222171
17 0.0110702448125396,0.823835745405824,0.159328434749373
18 0.0129699476861528,0.215176185314914,0.132213854183943
19 0.288309599660042,0.408196979669972,0.234206797809179
20 0.201607588932688,0.0768561341932842,1.41488279179445
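Split the comma-separated values. As before, the output below comes from a different random draw than the data frame printed above.
Example
# Split column "x" on commas into three numeric columns
cSplit(Rate, "x", ",")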
Output
            x_1        x_2        x_3
 1: 0.715195454 0.13364483 0.23434234
 2: 0.287033157 0.15786313 1.25176852
 3: 0.058617612 0.20123608 0.04697261
 4: 0.017659639 0.59218378 0.12339740
 5: 0.140093290 0.01579469 0.60002405
 6: 0.056465154 0.24717511 0.81812581
 7: 0.388322346 0.00329370 0.23367574
 8: 0.140153797 0.07190267 0.12958013
 9: 0.651111966 0.17373439 0.17571569
10: 0.452951349 0.33638535 0.34905380
11: 0.008617901 0.34899487 0.33835583
12: 0.107828330 0.12259176 0.12120932
13: 0.396324511 0.11236437 0.03790748
14: 0.063644511 0.38064627 0.20201243
15: 0.024197156 0.23522396 0.29972512
16: 0.013202143 1.31472758 0.40429690
17: 0.017637878 0.11423790 0.36857345
18: 0.113290019 0.18345270 0.34162356
19: 0.063014545 0.36422309 0.05693683
20: 0.075592492 0.18737919 0.04674416
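Decimal values like these can also be parsed by treating each string as a line of CSV text. The following is a minimal base R sketch of that idea, offered as an alternative rather than as the article's approach. The names rates_small and out, the column names, and the hand-written values are all illustrative assumptions.

```r
# Small hand-written data frame of comma-separated decimals (illustrative values)
rates_small <- data.frame(x = c("0.0159,0.0145,0.1284",
                                "0.5465,0.1676,0.2428"))

# Read each string as one comma-separated line; read.table() detects that
# the fields are numeric and converts them automatically
out <- read.table(text = as.character(rates_small$x),
                  sep = ",",
                  header = FALSE,
                  col.names = paste0("x_", 1:3))

# Show the column types and values
str(out)
```

Here the number of columns is fixed in advance through col.names, so this sketch suits data where the count of values per row is known.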