Filter all rows that start with any Latin alphabetic letter in R
How can I filter all rows in R whose value starts with any Latin letter?
Sample code that does not work:
library(dplyr)

df <- data.frame(
  marks = c(20.1, 30.2, 40.3, 50.4, 60.5),
  age   = 21:25,
  roles = c('Software Eng.', 'Software Dev',
            'Data Analyst', 'Data Eng.',
            '5Sigma')
)

# Does not work: grep() returns integer positions, not a logical vector
df %>% filter(grep("[A-z]", roles))
Expected output:
  marks age         roles
1  20.1  21 Software Eng.
2  30.2  22  Software Dev
3  40.3  23  Data Analyst
4  50.4  24     Data Eng.
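First, [A-z] and [A-Za-z] are not the same thing, so be careful with character classes: [A-z] covers every ASCII character from A through z, which includes the punctuation [ \ ] ^ _ ` that sits between the uppercase and lowercase letters. (See "Difference between regex [A-z] and [a-zA-Z]" and ignore the Java part.)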
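To make the difference concrete, here is a minimal sketch that runs both character classes over the six ASCII characters between Z and a, plus one letter and one digit for comparison. The vector name x and the column names wide and strict are only illustrative, and the result assumes a typical ASCII or UTF-8 locale, since R documents character ranges as locale-dependent.

```r
# Characters that sit between "Z" and "a" in ASCII (codes 91 to 96),
# followed by a letter and a digit for comparison
x <- c("[", "\\", "]", "^", "_", "`", "S", "5")

# [A-z] spans ASCII codes 65 to 122, so it also matches the six punctuation marks
wide <- grepl("[A-z]", x)

# [A-Za-z] matches only the 52 unaccented Latin letters
strict <- grepl("[A-Za-z]", x)

# Side-by-side view of what each class accepts
print(data.frame(x, wide, strict))
```

In this sketch only the row for "S" should be TRUE in both columns, and the row for "5" should be FALSE in both.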
Second, where does field: come from? Do this instead:
df %>%
  filter(grepl("^[A-Za-z]", roles))
#   marks age         roles
# 1  20.1  21 Software Eng.
# 2  30.2  22  Software Dev
# 3  40.3  23  Data Analyst
# 4  50.4  24     Data Eng.
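(Plus the earlier comment about grepl vs. grep.) filter() expects a logical vector, which grepl returns; grep returns integer positions instead.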
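The two remaining pieces of the fix, grepl instead of grep and the ^ anchor, can be checked without dplyr. The sketch below reuses the roles column from the question; the names idx, keep, unanchored, and kept are made up for illustration, and the last step is a base R stand-in for the filter() call rather than a replacement for it.

```r
roles <- c("Software Eng.", "Software Dev", "Data Analyst", "Data Eng.", "5Sigma")

# grep() returns the integer positions of the matching elements
idx <- grep("^[A-Za-z]", roles)

# grepl() returns one TRUE/FALSE per element, which is what filter() needs
keep <- grepl("^[A-Za-z]", roles)

# Without the ^ anchor the pattern matches a letter anywhere in the string,
# so "5Sigma" would be kept as well
unanchored <- grepl("[A-Za-z]", roles)

# Base R equivalent of the dplyr call, subsetting rows with the logical vector
df <- data.frame(age = 21:25, roles = roles)
kept <- df[keep, ]
print(kept)
```

Because keep has exactly one entry per row, it can be passed straight to filter() or used for row subsetting, whereas idx only makes sense as a row index.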