Add a row using each word in a string that is separated by underscores

Question

I have a long character vector in which every string consists of five substrings separated by underscores.

For example, here are two elements of the vector:

"land_somewhat_crop_produce_b.tif"
"marine_something_fish_meat_a.tif"

I want to create a data frame made up of these substrings:

col1	col2	col3	col4	col5
land	somewhat	crop	produce	b
marine	something	fish	meat	a

How can I use regex pattern matching to extract each substring between the underscores and build a data frame with one row per string?

Answer 1

# Remove the literal ".tif" extension (dot escaped, anchored to the end of the string)
data <- sub("\\.tif$", "", data)
data.frame(do.call(rbind,strsplit(data,"_")))

which gives:

      X1        X2   X3      X4 X5
1   land  somewhat crop produce  b
2 marine something fish    meat  a

Data:

data <- c("land_somewhat_crop_produce_b.tif","marine_something_fish_meat_a.tif")
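If you want the column names shown in the question (col1 to col5) rather than the default X1 to X5, one possible sketch in base R follows. It assumes every string has exactly five underscore-separated fields and ends in ".tif"; the names `stems`, `parts`, and `df` are only illustrative.

```r
# Sample input from the question
data <- c("land_somewhat_crop_produce_b.tif",
          "marine_something_fish_meat_a.tif")

# Strip the literal ".tif" suffix; the dot is escaped and anchored to the end
stems <- sub("\\.tif$", "", data)

# Split on underscores and bind the pieces into a matrix, one row per string
parts <- do.call(rbind, strsplit(stems, "_", fixed = TRUE))

# Convert to a data frame and name the columns as in the question
# (assumption: exactly five fields per string)
df <- setNames(as.data.frame(parts, stringsAsFactors = FALSE),
               paste0("col", 1:5))
df
```

Passing `fixed = TRUE` to `strsplit` treats the underscore as a plain character, which is sufficient here because no regex features are needed for the split itself.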

Answer 2

You can also do this:

read.table(text = sub("\\.tif$", "", data), sep = "_")

      V1        V2   V3      V4 V5
1   land  somewhat crop produce  b
2 marine something fish    meat  a

Answer 3

You can use cSplit from the splitstackshape package:

data <- data.frame(col = sub('\\.tif$', '', data))
splitstackshape::cSplit(data, 'col', sep = '_')

#    col_1     col_2 col_3   col_4 col_5
#1:   land  somewhat  crop produce     b
#2: marine something  fish    meat     a

Tags: regex, r, stringr, dplyr, tidyr