R - How to intersect() and include duplicates?
I have the following character fields that I am trying to intersect. These should be equal.
> char.y[[892]]
[1] "E" "d" "w" "a" "r" "d" "s" " " "L" "i" "f" "e" "s" "c" "i" "e" "n" "c" "e" "s"
> char.x[[892]]
[1] "E" "d" "w" "a" "r" "d" "s" " " "L" "i" "f" "e" "s" "c" "i" "e" "n" "c" "e" "s"
> intersect(char.x[[892]], char.y[[892]])
[1] "E" "d" "w" "a" "r" "s" " " "L" "i" "f" "e" "c" "n"
>
Expected result:
"E" "d" "w" "a" "r" "d" "s" " " "L" "i" "f" "e" "s" "c" "i" "e" "n" "c" "e"
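intersect() returns the common elements but does not repeat them. For example, "s" appears 3 times in each vector but only once in the result.
If you want to keep the original layout, i.e. only drop the non-intersecting values, you can use the following: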
a <- c("E", "d", "w", "a", "r", "d", "s", " ", "L", "i", "f", "e", "s", "c", "i", "e", "n", "c", "e", "s")
b <- c("E", "d", "w", "a", "r", "d", "s", " ", "L", "i", "f", "e", "s", "c", "i", "e", "n", "c", "e", "s")
a[a %in% intersect(a, b)]
# [1] "E" "d" "w" "a" "r" "d" "s" " " "L" "i" "f" "e" "s" "c" "i" "e" "n" "c" "e" "s"
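Note that filtering with %in% is not a true duplicate-aware intersection: it keeps every element of the filtered vector that occurs anywhere in the other one, so the result depends on which vector you filter. A minimal sketch of the difference, using made-up vectors x and y that are not from the question:

```r
# Illustrative vectors (not from the question) with different counts of "s" and "e"
x <- c("s", "s", "s", "e")
y <- c("s", "e", "e")

set_result <- intersect(x, y)  # unique common values only
filter_x   <- x[x %in% y]      # every element of x that occurs anywhere in y
filter_y   <- y[y %in% x]      # every element of y that occurs anywhere in x

print(set_result)
print(filter_x)
print(filter_y)
```

Here filter_x still contains all three "s" even though y has only one, so this approach is probably only what you want when the two vectors are expected to be nearly identical.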
This depends entirely on the vectors you are comparing (and their order), but would this be enough?
b <- a <- c('E', 'd', 'w', 'a', 'r', 'd', 's', '', 'L', 'i', 'f', 'e', 's', 'c', 'i', 'e', 'n', 'c', 'e')
c <- letters[sample(1:26, 100, replace = TRUE)]
a[is.element(a,b)]
# [1] "E" "d" "w" "a" "r" "d" "s" "" "L" "i" "f" "e" "s" "c" "i" "e" "n" "c" "e"
a[is.element(a,c)]
# [1] "d" "w" "a" "r" "d" "s" "i" "f" "e" "s" "c" "i" "e" "n" "c" "e"
I had exactly the same problem and did not find a solution, so I wrote my own small function, "intersectdup":
intersectdup <- function(vektor1, vektor2) {
  result <- c()
  for (i in seq_along(vektor2)) {
    if (is.element(vektor2[i], vektor1)) {
      result <- c(result, vektor2[i])
      # Remove the matched occurrence so it cannot be matched again
      foundAt <- match(vektor2[i], vektor1)
      vektor1 <- vektor1[-foundAt]
    }
  }
  return(result)
}
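The key step in the loop above is consuming one matched occurrence from the first vector so that it cannot be matched twice. A minimal sketch of that step in isolation, using a hypothetical helper drop_first that is not part of the answer:

```r
# Hypothetical helper: remove only the first occurrence of `value` from `v`
drop_first <- function(v, value) {
  i <- match(value, v)      # index of the first match, or NA if absent
  if (is.na(i)) v else v[-i]
}

one_removed <- drop_first(c("s", "c", "s"), "s")
unchanged   <- drop_first(c("a", "b"), "z")
print(one_removed)
print(unchanged)

# When slicing by index instead, remember that `:` binds tighter than `-` and `+`
print(1:3 - 1)    # (1:3) - 1
print(1:(3 - 1))  # the range 1 to 2
```

Negative indexing avoids the operator-precedence trap entirely, which is why it tends to be the safer way to drop a single element.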
Building on Clemens's example, here is a simple function written in a C-style structure:
intersectMe = function(x, y, duplicates=TRUE)
{
  xyi = intersect(x,y);
  if(!duplicates) { return (xyi); }
  res = c();
  for(xy in xyi)
  {
    y.xy = which(y == xy); ny.xy = length(y.xy);
    x.xy = which(x == xy); nx.xy = length(x.xy);
    min.xy = min(ny.xy, nx.xy);
    res = c(res, rep(xy, min.xy) );
  }
  res;
}
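The same idea (repeat each common value by the smaller of its two counts) can also be written without an explicit loop. The following is only a sketch under the assumption that neither vector contains NA; multiset_intersect is a hypothetical name, not a base R function:

```r
# Hypothetical helper, not part of base R or of the answers above.
# Assumes x and y contain no NA values.
multiset_intersect <- function(x, y) {
  common <- intersect(x, y)                           # unique shared values
  nx <- as.vector(table(factor(x, levels = common)))  # count of each shared value in x
  ny <- as.vector(table(factor(y, levels = common)))  # count of each shared value in y
  rep(common, times = pmin(nx, ny))                   # repeat by the smaller count
}

res <- multiset_intersect(c("s", "s", "s", "e", "x"), c("s", "s", "e", "e"))
print(res)
```

As with the loop version, the result is grouped by value rather than kept in the original order of either input.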
The vecsets library also helps (using the example Eric created):
vecsets::vintersect(a, b)
[1] "E" "d" "d" "w" "a" "r" "s" "s" "s" " " "L" "i" "i" "f" "e" "e" "e" "c" "c" "n"