Is there an alternative to grep?

Question

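I have two data frames. One of them contains question numbers stored as text, and I use grep() to match those numbers against the column names of the other data frame.

The problem is that part of my code doesn't work, because grep() isn't doing what I expect.

My two data frames look like this: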

DF1:

Question	Group
11	Redmeat
100	Chicken
56	Vegetables
210	Dairy

DF2 (the values don't matter, only the column names):

1.Question	2.Question	...	101.Question	...	250.Question
Yes	No	...	...	...	...
Yes	Yes	...	...	...	...
No	Yes	...	...	...	...
No	Yes	...	...	...	...

I use the following code:

i <- n ## I change n according to the row of DF1 that I want
grep(DF1$Question[i], colnames(DF2), fixed = T)

If I do this:

i <- 2  ## (Question number 100)
grep(DF1$Question[i], colnames(DF2), fixed = T)

My code returns 100, which is correct, because that is the column corresponding to "100.Question".

But if I do this:

i <- 1  ## (Question number 1)
grep(DF1$Question[i], colnames(DF2), fixed = T)

My code returns 1, 11, 21 ... 101 ... 201

The same thing happens if I do this:

i <- 3  ## (Question number 56)
grep(DF1$Question[i], colnames(DF2), fixed = T)

It returns 56, 156

I only want the column whose number matches exactly. Even with the argument fixed = TRUE it doesn't work.

Is there a solution or an alternative?

Answer 1

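You need grep to return a single match, so anchor the pattern at the start of the string with ^, followed by your number and a literal dot. You can't use the fixed = T argument in this case, because the pattern is now a regular expression.

grep(paste0("^", DF1$Question[i], "\\."), colnames(DF2))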
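A minimal sketch of the difference between the substring match and the anchored pattern. It assumes DF2's column names run from "1.Question" to "250.Question" as described in the question; the `cols` vector is a stand-in for colnames(DF2), not the asker's actual data.

```r
# Stand-in for colnames(DF2); assumed to be "1.Question" ... "250.Question"
cols <- paste0(1:250, ".Question")

# Substring match: "56" also occurs inside "156.Question"
substring_hits <- grep("56", cols, fixed = TRUE)

# Anchored match: "^" pins the start of the name, "\\." is a literal dot
anchored_hits <- grep(paste0("^", 56, "\\."), cols)

print(substring_hits)  # 56 156
print(anchored_hits)   # 56
```

With fixed = TRUE the pattern is treated as a plain substring, which is why the extra hits appear; the anchor and the dot together pin down the whole number.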

Answer 2

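Two options: 1) include the . in the grep pattern, grep(paste0("^", DF1$Question[i], "\\."), colnames(DF2)), or 2) paste on the full ".Question" and use exact matching, with no grep at all: paste0(DF1$Question, ".Question"). This is probably more efficient than a regular expression. Since your code has these i's all over it, I assume you are using a loop. grep and paste are vectorized, so if you give more context we may be able to help you avoid the loop entirely.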
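One way the second option might look, sketched under assumptions: `DF1` is rebuilt here from the table in the question, `cols` stands in for colnames(DF2), and match() is used for the exact lookup (the answer itself does not name a lookup function).

```r
# Hypothetical reconstruction of DF1 from the question
DF1 <- data.frame(
  Question = c("11", "100", "56", "210"),
  Group = c("Redmeat", "Chicken", "Vegetables", "Dairy"),
  stringsAsFactors = FALSE
)

# Stand-in for colnames(DF2); assumed to be "1.Question" ... "250.Question"
cols <- paste0(1:250, ".Question")

# Build the full column names, then look them all up exactly in one call
wanted <- paste0(DF1$Question, ".Question")
col_idx <- match(wanted, cols)  # NA for any name that is not present

print(col_idx)  # 11 100 56 210
```

Because the lookup handles every row of DF1 at once, no loop over i is needed. The resulting indices could then be used as DF2[, col_idx], or a single column fetched by name with DF2[[wanted[i]]].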

Answer 3

How about specifying in the pattern that the match must start at the beginning (^) and be followed by .Q?

i <- 3
# Escape the dot so it matches a literal "." rather than any character
grep(paste0("^", DF1$Question[i], "\\.Q"), colnames(DF2))

Output:

[1] 56
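A small illustration of how an unescaped dot behaves in such a pattern. The `candidates` names are hypothetical, chosen only to show the difference; "56xQuestion" does not appear in DF2.

```r
# Hypothetical names chosen to show the difference; not taken from DF2
candidates <- c("56.Question", "56xQuestion", "156.Question")

unescaped <- grepl("^56.Q", candidates)    # "." matches any single character
escaped   <- grepl("^56\\.Q", candidates)  # "\\." matches only a literal dot

print(unescaped)  # TRUE TRUE FALSE
print(escaped)    # TRUE FALSE FALSE
```

For column names that always have the form number.Question, both patterns select the same column; the escaped form is simply stricter.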
