R: Find files containing exact string match followed only by _ (ignore case)

Question

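I have files named

    value<-c("ABC_Seed_1_0.csv", "ABC_Seed_1_1.csv", "ABC_Seed_10_0.csv", "ABC_Seed_10_1.csv")

I want to find and delete only the files that belong to the archive seed_1.tar.xz (i.e. find all files named ABC_Seed_1_*.csv).

The problem is that if I search for seed_1, I also get seed_10. Is there a trick for this?

I tried adding the "_" with paste0: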

    #Available files
    value<-c("ABC_Seed_1_0.csv", "ABC_Seed_1_1.csv", "ABC_Seed_10_0.csv", "ABC_Seed_10_1.csv")

    library(dplyr)
    library(tidyr)

    #File to match against (minus extension)
    file<-c("seed_10.tar.xz")

    ListToDelete<- value %>% 
    as_tibble %>% 
    filter(value, 
    stringr::str_detect(string = value, pattern = paste0(fixed(tools::file_path_sans_ext(file, compression = TRUE),ignore_case = TRUE),"_"))
    
    #Returns an empty tibble


    file.remove(ListToDelete)

Answer 1

You may be making this more complicated than it needs to be. In base R, I would use grepl here:

    value[grepl("ABC_Seed_1_\\d+\\.csv", value)]
    [1] "ABC_Seed_1_0.csv" "ABC_Seed_1_1.csv"
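
The pattern above hard-codes the seed number. As a minimal sketch, the same kind of pattern can be built from the archive name, assuming the file names always follow the `<prefix>_Seed_<n>_<m>.csv` layout and the archive stem contains no regex metacharacters. `files_for_archive` is a hypothetical helper name used only for illustration:

```r
# Hypothetical helper: return the entries of `files` that belong to `archive`.
files_for_archive <- function(archive, files) {
  # "seed_1.tar.xz" -> "seed_1" (drops the compression suffix, then ".tar")
  stem <- tools::file_path_sans_ext(archive, compression = TRUE)
  # Require "_" right after the stem, then digits and ".csv" at the end,
  # so "seed_1" cannot match "Seed_10".
  pattern <- paste0("_", stem, "_\\d+\\.csv$")
  files[grepl(pattern, files, ignore.case = TRUE)]
}

value <- c("ABC_Seed_1_0.csv",  "ABC_Seed_1_1.csv",
           "ABC_Seed_10_0.csv", "ABC_Seed_10_1.csv")

seed_1_files  <- files_for_archive("seed_1.tar.xz", value)
seed_10_files <- files_for_archive("seed_10.tar.xz", value)
```

Here `ignore.case = TRUE` handles the difference between `seed` in the archive name and `Seed` in the CSV names, and the trailing `_` in the pattern is what keeps the two seeds apart.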

Data:

    value <- c("ABC_Seed_1_0.csv",  "ABC_Seed_1_1.csv",
               "ABC_Seed_10_0.csv", "ABC_Seed_10_1.csv")

Answer 2

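To improve on the previous answer: assuming your file names follow a standard layout, first split the input with strsplit to extract the seed number, then use grepl as suggested.

For example:

    value[grepl(paste("Seed_", as.numeric(strsplit(file, "[_|.]")[[1]][2]), "_", sep = ""), value, fixed = TRUE)]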
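
As for the attempt in the question, one likely reason it returned nothing is that `paste0()` returns a plain character vector, so wrapping `fixed(...)` inside `paste0()` discards the `ignore_case = TRUE` setting and the pattern is then matched case-sensitively. A sketch that builds the full string first and applies `fixed()` last, assuming stringr is installed:

```r
library(stringr)

value <- c("ABC_Seed_1_0.csv",  "ABC_Seed_1_1.csv",
           "ABC_Seed_10_0.csv", "ABC_Seed_10_1.csv")
file <- "seed_1.tar.xz"

# Build the complete literal first, then mark it as fixed and case-insensitive.
stem <- tools::file_path_sans_ext(file, compression = TRUE)  # "seed_1"
to_delete <- value[str_detect(value, fixed(paste0(stem, "_"), ignore_case = TRUE))]

# file.remove() expects a character vector of paths, not a tibble.
# Uncomment once `to_delete` looks right:
# file.remove(to_delete)
```

Subsetting the character vector directly also avoids the `as_tibble`/`filter` step, so the result can be passed straight to `file.remove()`.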

r

stringr