Filter all rows that start with any Latin alphabetic letter in R

Question

How can I filter all rows that start with any Latin alphabetic letter in R?

Sample code that does not work

library(dplyr)

df <- data.frame(marks = c(20.1, 30.2, 40.3, 50.4, 60.5),
                 age = c(21:25),
                 roles = c('Software Eng.', 'Software Dev',
                           'Data Analyst', 'Data Eng.',
                           '5Sigma'))

df %>% filter(grep("[A-z]", roles))

Expected output

  marks age         roles
1  20.1  21 Software Eng.
2  30.2  22  Software Dev
3  40.3  23  Data Analyst
4  50.4  24     Data Eng.

Answer 1

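First, [A-z] and [A-Za-z] are not the same, so be careful with character classes. In ASCII order, the range A-z also covers the six characters that sit between Z and a: [, \, ], ^, _ and the backtick. (See "Difference between regex [A-z] and [a-zA-Z]" and ignore the Java part.)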
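To make the difference concrete, here is a small sketch in base R. The test strings are made up for illustration, and perl = TRUE is my own choice so that the range is read in code-point order rather than depending on the locale's collation.

```r
# Hypothetical test strings: three start with a character that lies
# between 'Z' and 'a' in ASCII, one starts with a letter, one with a digit.
x <- c("_tmp", "^caret", "[bracket", "Data Eng.", "5Sigma")

# perl = TRUE: ranges follow code-point order (assumption made for this demo).
wide   <- grepl("^[A-z]", x, perl = TRUE)     # also accepts _ ^ [ and friends
strict <- grepl("^[A-Za-z]", x, perl = TRUE)  # letters only

data.frame(x, wide, strict)
```

Only "Data Eng." should come out TRUE in the strict column, while the wide column also accepts the three strings that start with punctuation.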

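Second, where does field: come from? Anchor the pattern with ^ so that it only tests the first character, and do this: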

df %>%
  filter(grepl("^[A-Za-z]", roles))
#   marks age         roles
# 1  20.1  21 Software Eng.
# 2  30.2  22  Software Dev
# 3  40.3  23  Data Analyst
# 4  50.4  24     Data Eng.

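(Plus the earlier comment about grepl versus grep: filter() expects a logical vector with one value per row, which is what grepl() returns, whereas grep() returns the integer positions of the matches.)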
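As a rough illustration of both points (grep versus grepl, and why the ^ anchor matters), the following base R sketch uses the roles column from the question as a plain vector. The variable names idx, keep and unanchored are only for this example.

```r
roles <- c("Software Eng.", "Software Dev", "Data Analyst", "Data Eng.", "5Sigma")

idx  <- grep("^[A-Za-z]", roles)    # integer positions of the matching elements
keep <- grepl("^[A-Za-z]", roles)   # one TRUE/FALSE per element, usable in filter()

# Without the anchor the pattern matches a letter anywhere in the string,
# so "5Sigma" is accepted as well.
unanchored <- grepl("[A-Za-z]", roles)

idx
keep
unanchored
roles[keep]
```

Here keep is FALSE only for "5Sigma", while unanchored is TRUE for every element, which is why the pattern in the answer starts with ^.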


Tags: r, stringr, dplyr