Substitute values in a data frame based on a partial match

This is my data:

> df1
        col1      col2
1  0/0:6:6,0 0/0:6:6,0
2  0/0:6:6,0 0/1:6:6,0
...
6  1/1:6:6,0 0/0:6:6,0
7  0/0:8:8,0 0/0:8:8,0

What I want is to replace long entries such as "0/0:6:6,0" with 0 if they start with "0/0", with 0.5 if they start with "0/1", and so on.

So far I have tried this:

1) replace with starts_with

df %>% mutate(col1 = replace(col1, starts_with("0/0"), 0)) %>% head()
    Error in mutate_impl(.data, dots) : 
      Evaluation error: Variable context not set.
    In addition: Warning message:
    In `[<-.factor`(`*tmp*`, list, value = 0) :
      invalid factor level, NA generated

2) grep (seen here as a solution)

df[,1][grep("0/1",df[,1])]<-0.5
Warning message:
In `[<-.factor`(`*tmp*`, grep("0/1", df[, 1]), value = c(NA, 2L,  :
  invalid factor level, NA generated

Feeling quite lost... it has been a long day.

We can use grepl, anchoring the pattern with ^ so that it only matches at the start of the string:

df1 %>%
   mutate(col1 = replace(col1, grepl("^0/0", col1), 0))
#       col1      col2
#1         0 0/0:6:6,0
#2         0 0/1:6:6,0
#3 1/1:6:6,0 0/0:6:6,0
#4         0 0/0:8:8,0

Or use startsWith from base R:

df1 %>%
    mutate(col1 = replace(col1, startsWith(col1, "0/0"), 0))
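The `invalid factor level` warnings in the question suggest that the columns there are factors. In that case assigning 0 yields NA, and startsWith stops with an error because it only accepts character vectors. A minimal sketch, assuming a small stand-in data frame (df_fct and df_chr are made-up names), that converts to character first:

```r
# Hypothetical stand-in for the question's data, with factor columns
df_fct <- data.frame(
  col1 = c("0/0:6:6,0", "0/0:6:6,0", "1/1:6:6,0", "0/0:8:8,0"),
  col2 = c("0/0:6:6,0", "0/1:6:6,0", "0/0:6:6,0", "0/0:8:8,0"),
  stringsAsFactors = TRUE
)

# Convert every factor column to character before replacing anything
df_chr <- df_fct
df_chr[] <- lapply(df_chr, as.character)

# The replacement now runs without the factor-level warning
df_chr$col1 <- replace(df_chr$col1, startsWith(df_chr$col1, "0/0"), 0)
df_chr$col1
```

Note that the column stays character here, so the replaced entries are the string "0" rather than the number 0.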

The issue with dplyr::starts_with is that it is a select helper that matches on variable names

df1 %>%
    select(starts_with('col1'))
#       col1
#1 0/0:6:6,0
#2 0/0:6:6,0
#6 1/1:6:6,0
#7 0/0:8:8,0

rather than on the values of a variable, whereas startsWith returns a logical vector, just as grepl does

startsWith(df1$col1, "0/0")
#[1]  TRUE  TRUE FALSE  TRUE
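To cover the "and so on" from the question in one pass, a named lookup vector keyed on the first three characters could be applied to every column. This is only a sketch: the value 1 for "1/1" is an assumption (the question only gives 0 and 0.5), and df_demo, geno_map and df_num are made-up names.

```r
# Hypothetical stand-in for the question's data (character columns)
df_demo <- data.frame(
  col1 = c("0/0:6:6,0", "0/0:6:6,0", "1/1:6:6,0", "0/0:8:8,0"),
  col2 = c("0/0:6:6,0", "0/1:6:6,0", "0/0:6:6,0", "0/0:8:8,0"),
  stringsAsFactors = FALSE
)

# Prefix -> value lookup; the 1 for "1/1" is an assumed value
geno_map <- c("0/0" = 0, "0/1" = 0.5, "1/1" = 1)

# Take the first three characters of each entry and look them up;
# unname() drops the prefix names, and unknown prefixes become NA
df_num <- df_demo
df_num[] <- lapply(df_demo, function(x) unname(geno_map[substr(x, 1, 3)]))
df_num
```

Unlike replace on a character column, this gives numeric columns, which is probably what is wanted if the values are used in later calculations.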