Matching combinations, but how do I handle the commas? I am confused

Question

I would like to ask a question. It would be very helpful if you could give it a try. Thanks. I have a vector here (a.new.list):

[1] "I1,I2" "I1,I3" "I1,I4" "I1,I5" "I2,I3" "I2,I4"
[7] "I2,I5" "I3,I4" "I3,I5" "I4,I5"

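Then I want to match it against the vector below (result), counting how many of its elements contain each combination: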

[1] "I1,I2,I5"    "I2,I4"       "I2,I3"      
[4] "I1,I2,I4"    "I1,I3"       "I2,I3"      
[7] "I1,I3"       "I1,I2,I3,I5" "I1,I2,I3"  


hits <- sapply(1:length(a.new.list), function(j) pmatch(result,a.new.list[j]))
colnames(hits) <- a.new.list
rownames(hits) <- result

apply(hits,1, sum,na.rm=TRUE)

I1,I2 I1,I3 I1,I4 I1,I5 I2,I3 I2,I4 I2,I5 I3,I4 I3,I5 I4,I5 
4     2     0     0     2     1     0     0     0     0

But that is not the result I expected. What I expected is:

I1,I2 I1,I3 I1,I4 I1,I5 I2,I3 I2,I4 I2,I5 I3,I4 I3,I5 I4,I5 
4     4     1     2     4     1     2     0     1     0

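If the two items of a combination are not next to each other in the string, the code treats it as not matching... but that is not what I need.

Thanks for your help. Regards.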
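A tiny illustration of the adjacency problem described above. This is a sketch, not the code from the question: it uses a literal substring test with grepl on one element of the second vector, and the variable names are made up for the example.

```r
# Illustrative only: one element of the second vector
x <- "I1,I2,I5"

# "I1,I5" is not a contiguous substring of "I1,I2,I5",
# so a literal substring test does not find it
adjacent_only <- grepl("I1,I5", x, fixed = TRUE)

# Splitting on commas and testing membership ignores position
items  <- strsplit(x, ",", fixed = TRUE)[[1]]
as_set <- all(c("I1", "I5") %in% items)

print(c(adjacent_only = adjacent_only, as_set = as_set))
```

The substring test fails because I2 sits between I1 and I5, while the membership test only asks whether both items are present.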

Answer 1

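This approach uses the melt.list method from reshape2. After splitting the strings and building two data frames from the pieces, we merge them on the item and count the groups in which every item of the combination matched. The code is tailored to searching for pairs; if the combination length changes, len must be changed accordingly: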

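library(reshape2)
len <- 2  # number of items per combination in a.new.list (pairs)
# Split each string on commas and melt to long format (columns: value, L1),
# where L1 is the index of the string the item came from
dfs <- lapply(list(result, a.new.list), 
               function(x) melt(strsplit(x, ",")))
# Join on the item: L1.x indexes a.new.list, L1.y indexes result
m <- merge(dfs[[2]], dfs[[1]], by=1)
# For combination n, count the result entries that contain all len items
f <- function(n) sum(aggregate(value~L1.y, m[m$L1.x == n,], 
               function(x) length(unique(x)) == len )$value)
setNames(sapply(seq_along(a.new.list), f), a.new.list)
#I1,I2 I1,I3 I1,I4 I1,I5 I2,I3 I2,I4 I2,I5 I3,I4 I3,I5 I4,I5 
#    4     4     1     2     4     2     2     0     1     0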
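For comparison, the same counts can probably be obtained in base R by treating every string as a set of items. This is a minimal sketch, not the method of the answer above; count_supersets is a hypothetical helper name, and the data are repeated so the block runs on its own.

```r
# Data as given in the question
a.new.list <- c("I1,I2", "I1,I3", "I1,I4", "I1,I5", "I2,I3",
                "I2,I4", "I2,I5", "I3,I4", "I3,I5", "I4,I5")
result <- c("I1,I2,I5", "I2,I4", "I2,I3", "I1,I2,I4", "I1,I3",
            "I2,I3", "I1,I3", "I1,I2,I3,I5", "I1,I2,I3")

# Split every string into its comma-separated items
combos <- strsplit(a.new.list, ",", fixed = TRUE)
sets   <- strsplit(result, ",", fixed = TRUE)

# Hypothetical helper: number of result entries containing every item of `p`
count_supersets <- function(p) {
  sum(vapply(sets, function(s) all(p %in% s), logical(1)))
}

counts <- setNames(vapply(combos, count_supersets, integer(1)), a.new.list)
print(counts)
```

Because the test is all(p %in% s), no len parameter is needed and combinations of any length are handled the same way. Note that I2,I4 comes out as 2 here, as in the answer's output, since both "I2,I4" and "I1,I2,I4" contain the two items.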

Data

a.new.list <- scan(what="character", text='"I1,I2" "I1,I3" "I1,I4" "I1,I5" "I2,I3" "I2,I4" "I2,I5" "I3,I4" "I3,I5" "I4,I5"')
result <- scan(what="character", text=' "I1,I2,I5"    "I2,I4"       "I2,I3"      
 "I1,I2,I4"    "I1,I3"       "I2,I3"      
                "I1,I3"       "I1,I2,I3,I5" "I1,I2,I3"  ')

Tags: r, match, apriori, sapply