Substitute values in a data frame based on a partial match
This is my data:
> df1
col1 col2
1 0/0:6:6,0 0/0:6:6,0
2 0/0:6:6,0 0/1:6:6,0
...
6 1/1:6:6,0 0/0:6:6,0
7 0/0:8:8,0 0/0:8:8,0
What I want is to replace long entries such as "0/0:6:6,0" with 0 if they start with "0/0", with 0.5 if they start with "0/1", and so on.
So far I have tried this:
1) replace-starts_with
df %>% mutate(col1 = replace(col1, starts_with("0/0"), 0)) %>% head()
Error in mutate_impl(.data, dots) :
Evaluation error: Variable context not set.
In addition: Warning message:
In `[<-.factor`(`*tmp*`, list, value = 0) :
invalid factor level, NA generated
2) grep (seen suggested as a solution)
df[,1][grep("0/1",df[,1])]<-0.5
Warning message:
In `[<-.factor`(`*tmp*`, grep("0/1", df[, 1]), value = c(NA, 2L, :
invalid factor level, NA generated
Feeling pretty lost... it has been a long day.
We can use grepl:
df1 %>%
mutate(col1 = replace(col1, grepl("^0/0", col1), 0))
# col1 col2
#1 0 0/0:6:6,0
#2 0 0/1:6:6,0
#3 1/1:6:6,0 0/0:6:6,0
#4 0 0/0:8:8,0
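One caveat: the warnings in the question ("invalid factor level, NA generated") suggest that the columns are factors, in which case assigning 0 produces NA, and startsWith expects character input. A minimal sketch of converting to character first is below. The data frame is a hypothetical reconstruction of the four rows shown, and stringsAsFactors = TRUE is only an assumption used to mimic the question's setup.

```r
# Hypothetical reconstruction of the data, with factor columns as in the question
df1 <- data.frame(
  col1 = c("0/0:6:6,0", "0/0:6:6,0", "1/1:6:6,0", "0/0:8:8,0"),
  col2 = c("0/0:6:6,0", "0/1:6:6,0", "0/0:6:6,0", "0/0:8:8,0"),
  stringsAsFactors = TRUE
)

# Convert every factor column to character; df1[] keeps the data.frame structure
df1[] <- lapply(df1, function(x) if (is.factor(x)) as.character(x) else x)

# On a character column the replacement works; 0 is coerced to the string "0"
df1$col1 <- replace(df1$col1, grepl("^0/0", df1$col1), 0)
```

Note that col1 stays a character column here, so the replaced entries are the string "0" rather than the number 0.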
Or use startsWith from base R:
df1 %>%
mutate(col1 = replace(col1, startsWith(col1, "0/0"), 0))
The problem with dplyr::starts_with is that it is a select helper that matches on variable names:
df1 %>%
select(starts_with('col1'))
# col1
#1 0/0:6:6,0
#2 0/0:6:6,0
#6 1/1:6:6,0
#7 0/0:8:8,0
It does not look at the values of a variable, whereas startsWith returns a logical vector, just like grepl:
startsWith(df1$col1, "0/0")
#[1] TRUE TRUE FALSE TRUE
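To cover all prefixes in one pass ("0/0" to 0, "0/1" to 0.5, and so on), one option is a named lookup vector indexed by the first three characters of each entry. This is only a sketch under assumptions: genotype_to_dosage is a hypothetical helper name, the mapping "1/1" to 1 is a guess at what "and so on" means in the question, and the data frame is reconstructed from the rows shown.

```r
# Hypothetical helper, not part of the original answer
genotype_to_dosage <- function(x) {
  # Assumed mapping; "1/1" -> 1 is a guess at the "and so on" in the question
  lookup <- c("0/0" = 0, "0/1" = 0.5, "1/1" = 1)
  # as.character() guards against factor input; substr() keeps the first 3 characters
  prefix <- substr(as.character(x), 1, 3)
  # Index the lookup by name; prefixes not in the lookup become NA
  unname(lookup[prefix])
}

# Hypothetical reconstruction of the data, with factor columns as in the question
df1 <- data.frame(
  col1 = c("0/0:6:6,0", "0/0:6:6,0", "1/1:6:6,0", "0/0:8:8,0"),
  col2 = c("0/0:6:6,0", "0/1:6:6,0", "0/0:6:6,0", "0/0:8:8,0"),
  stringsAsFactors = TRUE
)

# Apply the helper to every column; both columns become numeric
df1[] <- lapply(df1, genotype_to_dosage)
```

Unlike the replace() approach, this returns numeric columns, which is probably what is wanted if the values are used in later calculations.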