Get the first element from the file path
I have a vector in a data frame:
c("E:\\\\My Network Places.old.dat", "E:\\\\pagefile.sys", "E:\\\\Press_Dly_Diff_G_91.rbc",
"E:\\\\TV_Press_Dly_Diff_A\\Retrospect\\Dantz", "E:\\\\TV_Press_Dly_Diff_A\\Retrospect\\TV_Press_Dly_Diff_A\\1-TV_Press_Dly_Diff_A\\AA000083.rdb",
"E:\\\\TV_Press_Dly_Diff_A\\Retrospect\\TV_Press_Dly_Diff_A\\1-TV_Press_Dly_Diff_A\\AA000561.rdb"
)
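In the above, the file paths vary in depth from 1 to 5 levels. I am trying to put each level of the file path into its own column of the data frame. I tried the following for the first part: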
library(stringr)
df1$PF <- strsplit(df1$File.Name, "\\\\")
df1$PFolder <- df1$PF[[1]][3]
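But I only get the first value, My Network Places.old.dat, for every row of the data frame. How can I split the path on the \ delimiter into multiple columns and save them as separate columns in the data frame? The desired output looks like this: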
File.Name FilePath1 FilePath2 FilePath3
The character vector above is the File.Name column of the data frame.
We can use cSplit from splitstackshape:
# "\\\\+" is the regex \\+ : one or more literal backslashes
splitstackshape::cSplit(df, "path", "\\\\+", fixed = FALSE)
#   path_1                    path_2     path_3              path_4                path_5       path_6
#1:     E: My Network Places.old.dat       <NA>                <NA>                  <NA>         <NA>
#2:     E:              pagefile.sys       <NA>                <NA>                  <NA>         <NA>
#3:     E:   Press_Dly_Diff_G_91.rbc       <NA>                <NA>                  <NA>         <NA>
#4:     E:       TV_Press_Dly_Diff_A Retrospect               Dantz                  <NA>         <NA>
#5:     E:       TV_Press_Dly_Diff_A Retrospect TV_Press_Dly_Diff_A 1-TV_Press_Dly_Diff_A AA000083.rdb
#6:     E:       TV_Press_Dly_Diff_A Retrospect TV_Press_Dly_Diff_A 1-TV_Press_Dly_Diff_A AA000561.rdb
Or, if you already know how many columns the data will expand to, we can also use separate:
tidyr::separate(df, path, into = paste0('path', 1:6), sep = "\\\\+", fill = 'right')
Data
# x is the character vector from the question
df <- data.frame(path = x, stringsAsFactors = FALSE)
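If the number of columns is not known in advance, one option is to count the pieces first and build the `into` names from that. This is only a sketch: the two sample paths and the names `n_cols` and `into` are made up for illustration, and the `separate` call is guarded so the snippet still runs when tidyr is not installed.

```r
# Two sample paths: two backslashes after the drive letter, one elsewhere
x <- c("E:\\\\pagefile.sys",
       "E:\\\\TV_Press_Dly_Diff_A\\Retrospect\\Dantz")
df <- data.frame(path = x, stringsAsFactors = FALSE)

# Largest number of pieces obtained by splitting on runs of backslashes
n_cols <- max(lengths(strsplit(df$path, "\\\\+")))

# Column names path1, path2, ... up to the deepest path
into <- paste0("path", seq_len(n_cols))

# Pass the computed names to separate() when tidyr is available
if (requireNamespace("tidyr", quietly = TRUE)) {
  out <- tidyr::separate(df, path, into = into, sep = "\\\\+", fill = "right")
  print(out)
}
```

Shorter paths are padded with NA on the right because of `fill = "right"`.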
Import with "\" as the delimiter:
library(data.table)
# Each backslash in the data is written as "\\" inside the R string
fread("
E:\\\\My Network Places.old.dat
E:\\\\pagefile.sys
E:\\\\Press_Dly_Diff_G_91.rbc
E:\\\\TV_Press_Dly_Diff_A\\Retrospect\\Dantz
E:\\\\TV_Press_Dly_Diff_A\\Retrospect\\TV_Press_Dly_Diff_A\\1-TV_Press_Dly_Diff_A\\AA000083.rdb
E:\\\\TV_Press_Dly_Diff_A\\Retrospect\\TV_Press_Dly_Diff_A\\1-TV_Press_Dly_Diff_A\\AA000561.rdb",
sep = "\\", fill = TRUE, na.strings = "")
#    V1 V2                        V3         V4                  V5                    V6           V7
# 1: E: NA My Network Places.old.dat       <NA>                <NA>                  <NA>         <NA>
# 2: E: NA              pagefile.sys       <NA>                <NA>                  <NA>         <NA>
# 3: E: NA   Press_Dly_Diff_G_91.rbc       <NA>                <NA>                  <NA>         <NA>
# 4: E: NA       TV_Press_Dly_Diff_A Retrospect               Dantz                  <NA>         <NA>
# 5: E: NA       TV_Press_Dly_Diff_A Retrospect TV_Press_Dly_Diff_A 1-TV_Press_Dly_Diff_A AA000083.rdb
# 6: E: NA       TV_Press_Dly_Diff_A Retrospect TV_Press_Dly_Diff_A 1-TV_Press_Dly_Diff_A AA000561.rdb
Here is a base R solution using strsplit:
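# Split on runs of backslashes, pad every piece vector to the longest length with NA, then row-bind
res <- data.frame(do.call(rbind, lapply(s <- strsplit(v, split = "\\\\+"), `length<-`, max(lengths(s)))))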
which gives
> res
  X1                        X2         X3                  X4                    X5           X6
1 E: My Network Places.old.dat       <NA>                <NA>                  <NA>         <NA>
2 E:              pagefile.sys       <NA>                <NA>                  <NA>         <NA>
3 E:   Press_Dly_Diff_G_91.rbc       <NA>                <NA>                  <NA>         <NA>
4 E:       TV_Press_Dly_Diff_A Retrospect               Dantz                  <NA>         <NA>
5 E:       TV_Press_Dly_Diff_A Retrospect TV_Press_Dly_Diff_A 1-TV_Press_Dly_Diff_A AA000083.rdb
6 E:       TV_Press_Dly_Diff_A Retrospect TV_Press_Dly_Diff_A 1-TV_Press_Dly_Diff_A AA000561.rdb
Data
v <- c("E:\\\\My Network Places.old.dat", "E:\\\\pagefile.sys", "E:\\\\Press_Dly_Diff_G_91.rbc",
"E:\\\\TV_Press_Dly_Diff_A\\Retrospect\\Dantz", "E:\\\\TV_Press_Dly_Diff_A\\Retrospect\\TV_Press_Dly_Diff_A\\1-TV_Press_Dly_Diff_A\\AA000083.rdb",
"E:\\\\TV_Press_Dly_Diff_A\\Retrospect\\TV_Press_Dly_Diff_A\\1-TV_Press_Dly_Diff_A\\AA000561.rdb"
)
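As for why the attempt in the question repeats one value: `df1$PF[[1]][3]` picks a single piece out of the first row's split only, and R then recycles that one string over every row. Below is a rough sketch of taking one level per row instead. It assumes paths shaped like the data above (two backslashes after the drive letter), and `parts` and `second_level` are illustrative names, not anything from the question.

```r
# Two sample paths: two backslashes after the drive letter, one elsewhere
v <- c("E:\\\\My Network Places.old.dat",
       "E:\\\\TV_Press_Dly_Diff_A\\Retrospect\\Dantz")

# Split on runs of one or more backslashes; the result is a list with one
# character vector per path
parts <- strsplit(v, "\\\\+")

# parts[[1]][2] would return a value from the first path only;
# vapply picks the second piece from every path
second_level <- vapply(parts, function(p) p[2], character(1))
second_level
```

A path with fewer pieces than the requested index yields NA, since indexing past the end of a character vector returns NA.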