R: get dataframe row with specific characters

Question

I need to detect the rows of a df/tibble that contain a specific sequence of characters.

seq <- "RT @AventusSystems" is my sequence

df <- structure(list(text = c("@AventusSystems Wow, what a upgrade from help of investor", 
"RT @AventusSystems: A recent article about our investors as shown in Forbes! t.co/n8oGwiEDpu #Aventus #GlobalAdvisors #4thefans #Ti…", 
"@AventusSystems Very nice to have this project", "RT @AventusSystems: Join the #TicketRevolution with #Aventus today! #Aventus #TicketRevolution #AventCoin #4thefans t.co/OPlyCFmW4a"
), Tweet_Id = c("898359464444559360", "898359342952439809", "898359326552633345", 
"898359268226736128"), created_at = structure(c(17396, 17396, 
17396, 17396), class = "Date")), .Names = c("text", "Tweet_Id", 
"created_at"), row.names = c(NA, -4L), class = c("tbl_df", "tbl", 
"data.frame"))

select(df, contains(seq))
# A tibble: 4 x 0

sapply(df$text, grepl, seq) returns only 4 FALSE values.

What am I doing wrong, and what is the right solution? Thanks for the help.

Answer 1

First, grepl is already vectorized over its argument x, so you don't need sapply. You can simply do grepl(seq, df$text).

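Your code doesn't work because sapply passes each element of its X argument as the first argument to the function given in FUN. The first argument of grepl is the pattern, so you are searching for the pattern "@AventusSystems Wow, what a upgrade from help of investor" inside your seq object, then for the next tweet, and so on.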
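The difference in argument order can be seen in a minimal sketch. The two-element `texts` vector is a shortened stand-in for df$text, made up for illustration, and `wrong` and `right` are hypothetical names:

```r
# Shortened stand-in for df$text (hypothetical sample, not the original data)
texts <- c("@AventusSystems Very nice to have this project",
           "RT @AventusSystems: Join the #TicketRevolution")
seq <- "RT @AventusSystems"

# sapply passes each element of texts as grepl's first argument (pattern),
# so each tweet is used as the pattern and searched for inside seq
wrong <- sapply(texts, grepl, seq, USE.NAMES = FALSE)

# grepl(pattern, x) is vectorized over x: seq is searched for in each tweet
right <- grepl(seq, texts)

print(wrong)
print(right)
```

Here `wrong` is FALSE for every element, which matches what the question reports, while `right` is TRUE only for the tweet that contains the sequence.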

Finally, dplyr::select selects columns, whereas you want to filter rows with dplyr::filter.
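Put together, the row filter might look like the sketch below. It assumes dplyr is installed and uses a shortened two-row tibble in place of the original df; `retweets` is a hypothetical name.

```r
library(dplyr)

# Shortened stand-in for the original df (hypothetical sample data)
df <- tibble(text = c("@AventusSystems Very nice to have this project",
                      "RT @AventusSystems: Join the #TicketRevolution"))
seq <- "RT @AventusSystems"

# filter() keeps the rows for which the condition is TRUE;
# grepl(seq, text) tests every element of the text column at once
retweets <- filter(df, grepl(seq, text))
print(retweets)
```

Without dplyr, the same rows can be obtained in base R with df[grepl(seq, df$text), ].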

string

r

tibble