How to extract words from a string vector in R?

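To extract words from a string vector, we can use the word function from the stringr package. For example, if x holds a string of 100 words, the first 20 words can be extracted with the command word(x,start=1,end=20,sep=fixed(" ")). Here start and end are the positions of the first and last word to return (both inclusive), and sep is the separator between words. To begin from a different word, change the start value accordingly.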
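As a minimal sketch of how the function behaves, the snippet below uses a short made-up vector s (not part of the original example) with one sentence per element. It assumes the stringr package is installed.

```r
library(stringr)

# Made-up vector for illustration: one sentence per element
s <- c("Jane saw a cat", "Jane sat down")

# word() is vectorised: it returns one result per element of s
word(s, start = 1, end = 2, sep = fixed(" "))
# [1] "Jane saw" "Jane sat"

# end defaults to start, so this returns only the second word of each element
word(s, start = 2)
# [1] "saw" "sat"
```

The same call works on a vector of any length, so a single long string such as the one used below is just the one-element case.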



library(stringr)
x<-c("R is a programming language and software environment for statistical analysis, graphics representation and reporting. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team. R is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems like Linux, Windows and Mac. This programming language was named R, based on the first letter of first name of the two R authors (Robert Gentleman and Ross Ihaka), and partly a play on the name of the Bell Labs Language S.")
x


[1] "R is a programming language and software environment for statistical analysis, graphics representation and reporting. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team. R is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems like Linux, Windows and Mac. This programming language was named R, based on the first letter of first name of the two R authors (Robert Gentleman and Ross Ihaka), and partly a play on the name of the Bell Labs Language S."


word(x,start=1,end=5,sep=fixed(" "))


[1] "R is a programming language"
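Every call in this article passes sep=fixed(" "), which is also the default value of sep in word. The sketch below, using a hypothetical comma-separated string rec that is not from the original example, illustrates what changing the separator does.

```r
library(stringr)

# Hypothetical comma-separated record, used only for illustration
rec <- "alpha,beta,gamma,delta"

# Treat a literal comma as the word separator
word(rec, start = 2, end = 3, sep = fixed(","))
# [1] "beta,gamma"

# Omitting sep falls back to the default, fixed(" ")
word("alpha beta gamma", start = 2, end = 3)
# [1] "beta gamma"
```

Because the separator is a literal space in the examples here, punctuation stays attached to the neighbouring word, which is why results such as "analysis," keep their comma.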


word(x,start=1,end=20,sep=fixed(" "))


[1] "R is a programming language and software environment for statistical analysis, graphics representation and reporting. R was created by Ross"


word(x,start=1,end=10,sep=fixed(" "))


[1] "R is a programming language and software environment for statistical"


word(x,start=1,end=15,sep=fixed(" "))


[1] "R is a programming language and software environment for statistical analysis, graphics representation and reporting."


word(x,start=1,end=50,sep=fixed(" "))


[1] "R is a programming language and software environment for statistical analysis, graphics representation and reporting. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team. R is freely available under the GNU General Public"


word(x,start=11,end=20,sep=fixed(" "))


[1] "analysis, graphics representation and reporting. R was created by Ross"


word(x,start=51,end=60,sep=fixed(" "))


[1] "License, and pre-compiled binary versions are provided for various operating"


word(x,start=6,end=10,sep=fixed(" "))


[1] "and software environment for statistical"


word(x,start=11,end=60,sep=fixed(" "))


[1] "analysis, graphics representation and reporting. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team. R is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating"


word(x,start=1,end=90,sep=fixed(" "))


[1] "R is a programming language and software environment for statistical analysis, graphics representation and reporting. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team. R is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems like Linux, Windows and Mac. This programming language was named R, based on the first letter of first name of the two R authors (Robert Gentleman and Ross Ihaka),"


word(x,start=11,end=90,sep=fixed(" "))


[1] "analysis, graphics representation and reporting. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team. R is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems like Linux, Windows and Mac. This programming language was named R, based on the first letter of first name of the two R authors (Robert Gentleman and Ross Ihaka),"


word(x,start=21,end=90,sep=fixed(" "))


[1] "Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team. R is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems like Linux, Windows and Mac. This programming language was named R, based on the first letter of first name of the two R authors (Robert Gentleman and Ross Ihaka),"


word(x,start=51,end=100,sep=fixed(" "))


[1] "License, and pre-compiled binary versions are provided for various operating systems like Linux, Windows and Mac. This programming language was named R, based on the first letter of first name of the two R authors (Robert Gentleman and Ross Ihaka), and partly a play on the name of the Bell"
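When the number of words is not known in advance, it can help to count them first or to index from the end. The following is a small sketch under the assumption that words are separated by single spaces; the string y and the variable n_words are made up for illustration, and strsplit and lengths are base R functions rather than part of stringr.

```r
library(stringr)

# Hypothetical short string used only for illustration
y <- "The quick brown fox jumps over the lazy dog"

# Count the space-separated words with base R
n_words <- lengths(strsplit(y, " ", fixed = TRUE))
n_words
# [1] 9

# Negative positions count backwards from the last word
word(y, start = -3, end = -1)
# [1] "the lazy dog"

# Individual words as a character vector instead of one joined string
strsplit(y, " ", fixed = TRUE)[[1]][1:3]
# [1] "The"   "quick" "brown"
```

Applied to x, the same counting step gives the largest value that can sensibly be used for end.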

Updated on: 10 February 2021



