Partial string matching between two columns in R
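I am trying to verify that the emails in a list are correct. My idea is to do a partial string match between the Email column and the name column, and return the result as a logical vector (TRUE/FALSE) in a new column.
In the example below, only rows 3 and 5 have correct emails, so the output for those rows would be TRUE. I tried the following, but neither attempt worked: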
for (i in Test$LastName) {
  Test$Match <- agrepl(i, Test$Email, ignore.case = TRUE)
}
Test$Email %in% Test$LastName
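Any other suggestions are welcome too. Thanks!
Try something like this. You were almost there; you just need to store the TRUE/FALSE values in a vector. I used sapply to iterate over the row indices and compare the corresponding columns. sapply collects the results into a vector, which you can then use as a logical index: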
test <- data.frame(FirstName = c("Audrey", "Tammy", "Stacey", "Judson", "Kellie"),
                   LastName = c("Low", "Rose", "Lock", "Porter", "Sims"),
                   Email = c("T.Rose@gmail.com", "A.Low@gmail.com", "stacy.lock@gmail.com", "beth.mccormick@gmail.com", "k.sims@gmail.com"))
# Compare each last name only with the email in the same row
matches <- sapply(seq_len(nrow(test)), function(i) agrepl(test$LastName[i], test$Email[i]))
# Keep the rows whose email matches
test[matches, ]
  FirstName LastName                Email
3    Stacey     Lock stacy.lock@gmail.com
5    Kellie     Sims     k.sims@gmail.com
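To make the difference from the loop in the question concrete, here is a minimal sketch. It assumes the same sample data as above, built with stringsAsFactors = FALSE (an assumption on my part, so the columns are plain character vectors), and it uses vapply rather than sapply so the result is guaranteed to be logical. loop_result is a hypothetical name used only for illustration.

```r
test <- data.frame(
  FirstName = c("Audrey", "Tammy", "Stacey", "Judson", "Kellie"),
  LastName  = c("Low", "Rose", "Lock", "Porter", "Sims"),
  Email     = c("T.Rose@gmail.com", "A.Low@gmail.com", "stacy.lock@gmail.com",
                "beth.mccormick@gmail.com", "k.sims@gmail.com"),
  stringsAsFactors = FALSE  # assumption: keep columns as character
)

# The loop from the question: every pass overwrites the whole Match column,
# so only the comparison against the last name in the list ("Sims") survives.
loop_result <- test
for (i in loop_result$LastName) {
  loop_result$Match <- agrepl(i, loop_result$Email, ignore.case = TRUE)
}

# Row-wise version: LastName[i] is compared only with Email[i],
# and the result is stored directly as a new column.
test$Match <- vapply(
  seq_len(nrow(test)),
  function(i) agrepl(test$LastName[i], test$Email[i]),
  logical(1)
)
```

Note that agrepl does approximate matching, which is what lets "Lock" match "lock" here even without ignore.case; if you want exact substring matching, grepl with ignore.case = TRUE is the more predictable choice.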
Try this:
DF <- data.frame(FirstName = c("Audrey", "Tammy", "Stacey", "Judson", "Kellie"),
                 LastName = c("Low", "Rose", "Lock", "Porter", "Sims"),
                 Email = c("T.Rose@gmail.com", "A.Low@gmail.com", "stacy.lock@gmail.com", "beth.mccormick@gmail.com", "k.sims@gmail.com"))
library(dplyr)
# rowwise() makes grepl() see one LastName/Email pair at a time
DF %>%
  rowwise() %>%
  mutate(isMatch = grepl(LastName, Email, ignore.case = TRUE))
Output:
  FirstName LastName Email                    isMatch
  <fct>     <fct>    <fct>                    <lgl>
1 Audrey    Low      T.Rose@gmail.com         FALSE
2 Tammy     Rose     A.Low@gmail.com          FALSE
3 Stacey    Lock     stacy.lock@gmail.com     TRUE
4 Judson    Porter   beth.mccormick@gmail.com FALSE
5 Kellie    Sims     k.sims@gmail.com         TRUE
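One thing to keep in mind: grepl treats LastName as a regular expression, so a name containing characters such as "." or "(" could match more than intended or raise a regex error. A possible variant, sketched below in base R, matches the name literally instead. Because fixed = TRUE cannot be combined with ignore.case, both sides are lowercased first. contains_ci is a hypothetical helper name, and stringsAsFactors = FALSE is my assumption, not part of the answer above.

```r
DF <- data.frame(
  FirstName = c("Audrey", "Tammy", "Stacey", "Judson", "Kellie"),
  LastName  = c("Low", "Rose", "Lock", "Porter", "Sims"),
  Email     = c("T.Rose@gmail.com", "A.Low@gmail.com", "stacy.lock@gmail.com",
                "beth.mccormick@gmail.com", "k.sims@gmail.com"),
  stringsAsFactors = FALSE  # assumption: keep columns as character
)

# Hypothetical helper: TRUE where needle[i] occurs literally in haystack[i],
# ignoring case. Both vectors are expected to have the same length.
contains_ci <- function(needle, haystack) {
  needle <- tolower(needle)
  haystack <- tolower(haystack)
  vapply(
    seq_along(needle),
    function(i) grepl(needle[i], haystack[i], fixed = TRUE),
    logical(1)
  )
}

DF$isMatch <- contains_ci(DF$LastName, DF$Email)
```

With the sample data this flags the same rows as the dplyr version; the difference only shows up for names that contain regex metacharacters.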
A base R option is to use grepl + mapply:
# The pattern "FirstName|LastName" matches if either name occurs in the email
Test <- within(Test, Match <- mapply(grepl, paste(FirstName, LastName, sep = "|"), Email, ignore.case = TRUE))
which gives
> Test
  FirstName LastName                    Email Match
1    Audrey      Low         T.Rose@gmail.com FALSE
2     Tammy     Rose          A.Low@gmail.com FALSE
3    Stacey     Lock     stacy.lock@gmail.com  TRUE
4    Judson   Porter beth.mccormick@gmail.com FALSE
5    Kellie     Sims         k.sims@gmail.com  TRUE
Data
Test <- data.frame(FirstName = c("Audrey", "Tammy", "Stacey", "Judson", "Kellie"),
                   LastName = c("Low", "Rose", "Lock", "Porter", "Sims"),
                   Email = c("T.Rose@gmail.com", "A.Low@gmail.com", "stacy.lock@gmail.com", "beth.mccormick@gmail.com", "k.sims@gmail.com"))
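Because the first and last name are pasted together with "|", a row counts as a match when either name appears in the email. The sketch below spells out the patterns that get built and contrasts them with a last-name-only check. The three-row data frame is made up for illustration (the address kellie99@gmail.com in particular is hypothetical), and the column names patterns and LastNameMatch are my own.

```r
Test <- data.frame(
  FirstName = c("Stacey", "Kellie", "Judson"),
  LastName  = c("Lock", "Sims", "Porter"),
  Email     = c("stacy.lock@gmail.com", "kellie99@gmail.com",
                "beth.mccormick@gmail.com"),
  stringsAsFactors = FALSE  # assumption: keep columns as character
)

# One alternation pattern per row, e.g. "Stacey|Lock"
patterns <- paste(Test$FirstName, Test$LastName, sep = "|")

# Either name may match; MoreArgs passes ignore.case to every grepl call
Test$Match <- unname(mapply(grepl, patterns, Test$Email,
                            MoreArgs = list(ignore.case = TRUE)))

# Stricter variant: only the last name counts
Test$LastNameMatch <- unname(mapply(grepl, Test$LastName, Test$Email,
                                    MoreArgs = list(ignore.case = TRUE)))
```

Here the second row matches through the first name only, so Match is TRUE while LastNameMatch is FALSE. Which behaviour you want depends on how strict the validation should be.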