How can I change columns with mutate(across()) based on a specific regex?
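I have a question about the mutate(across()) function.
In the tibble below, I want to remove the "letter + underscores" prefix (e.g. "p__", "c__", etc.) from the columns.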
# A tibble: 2,477 x 4
Phylum Class Order Family
<chr> <chr> <chr> <chr>
1 " p__Proteobacteria" " c__Gammaproteobacter~ " o__Aeromonadales" " f__Aeromonadaceae"
2 " p__Bacteroidota" " c__Bacteroidia" " o__Bacteroidales" " f__Williamwhitmaniac~
3 " p__Fusobacteriota" " c__Fusobacteriia" " o__Fusobacterial~ " f__Leptotrichiaceae"
4 " p__Firmicutes" " c__Clostridia" " o__Clostridiales" " f__Clostridiaceae"
5 " p__Proteobacteria" " c__Gammaproteobacter~ " o__Enterobactera~ " f__Enterobacteriacea~
6 " p__Bacteroidota" " c__Bacteroidia" " o__Bacteroidales" " f__Williamwhitmaniac~
7 " p__Firmicutes" " c__Clostridia" " o__Lachnospirale~ " f__Lachnospiraceae"
8 " p__Bacteroidota" " c__Bacteroidia" " o__Cytophagales" " f__Spirosomaceae"
9 " p__Proteobacteria" " c__Gammaproteobacter~ " o__Burkholderial~ " f__Comamonadaceae"
10 " p__Actinobacteriot~ " c__Actinobacteria" " o__Frankiales" " f__Sporichthyaceae"
# ... with 2,467 more rows
A year ago I used this command:
table <- table %>%
mutate_at(vars(Phylum, Class, Order, Family),funs(sub(pattern = "^([a-z])(_{2})", replacement = "", .)))
Now it tells me that funs() is no longer supported and no longer works.
Do you have any suggestions for me?
I considered:
taxon <- c("Phylum", "Class", "Order", "Family")
table <- table %>%
mutate(across(taxon), gsub(pattern = "^([a-z])(_{2})", replacement = "", .))
But I get the error:
Error: Invalid index: out of bounds
Many thanks :)
Katherine
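You can do this:
library(dplyr)
taxon <- c("Phylum", "Class", "Order", "Family")
# The lambda (~ ...) goes inside across(); all_of() marks taxon as a
# character vector of column names rather than a column called "taxon".
table <- table %>%
  mutate(across(all_of(taxon),
                ~ gsub(pattern = "^([a-z])(_{2})", replacement = "", .)))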
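I don't have your data to confirm this, but there appears to be a space at the start of each string. Because the pattern is anchored with ^, that space has to be removed first, otherwise nothing matches.
# trimws() strips the leading space so that ^ lines up with the letter prefix.
table <- table %>%
  mutate(across(all_of(taxon),
                ~ gsub(pattern = "^([a-z])(_{2})", replacement = "", trimws(.))))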
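Since the original data is not available, here is a minimal sketch on a small made-up data frame that copies two fully visible rows from the printed tibble. The names `taxa` and `cleaned` are hypothetical, and the pattern is written as the shorter but equivalent `^[a-z]__`. It assumes dplyr 1.0.0 or later, where across() and all_of() are available. It also shows what seems to go wrong in the attempt from the question: there the parenthesis of across() is closed before the function, so gsub() is passed to mutate() as a separate argument instead of being applied to each selected column.

```r
library(dplyr)

# Hypothetical stand-in for the real data: same leading space and
# rank prefix as in the printed tibble (rows 4 and 8).
taxa <- data.frame(
  Phylum = c(" p__Firmicutes", " p__Bacteroidota"),
  Class  = c(" c__Clostridia", " c__Bacteroidia"),
  Order  = c(" o__Clostridiales", " o__Cytophagales"),
  Family = c(" f__Clostridiaceae", " f__Spirosomaceae"),
  stringsAsFactors = FALSE
)

taxon <- c("Phylum", "Class", "Order", "Family")

# Without trimws() the anchored pattern does not match, because the
# first character is a space rather than a lowercase letter.
unchanged <- sub("^[a-z]__", "", " p__Firmicutes")

# The function is the second argument of across(), inside its parentheses.
cleaned <- taxa %>%
  mutate(across(all_of(taxon), ~ sub("^[a-z]__", "", trimws(.x))))

print(cleaned)
```

sub() is enough here because the anchored pattern can match at most once per string; gsub() gives the same result.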