Using stringdist_join with differing column names
I have the following example data:
library(fuzzyjoin)
a <- data.frame(x = c("season", "season", "season", "package", "package"), y = c("1","2", "3", "1","6"))
b <- data.frame(x = c("season", "seson", "seson", "package", "pakkage"), w = c("1","2", "3", "2","6"))
c <- data.frame(z = c("season", "seson", "seson", "package", "pakkage"), w = c("1","2", "3", "2","6"))
The following works fine:
d <- stringdist_left_join(a,b, by = "x", max_dist = 2)
But joining on columns with different names does not work (note that the join is now between a and c):
e <- stringdist_left_join(a,c, by = c("x", "z"), max_dist = 2)
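I want to tell stringdist_left_join to join on two differently named columns, as in the last line of code (e), but it does not seem to accept that.
Is there any solution (other than copying the column and renaming it)?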
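You can pair two differently named columns with = inside by: the name on the left is the column in the first data frame, and the value on the right is the column in the second. The following code works: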
e <- stringdist_left_join(a,c, by = c("x" = "z"), max_dist = 2)
Output:
         x y       z w
1   season 1  season 1
2   season 1   seson 2
3   season 1   seson 3
4   season 2  season 1
5   season 2   seson 2
6   season 2   seson 3
7   season 3  season 1
8   season 3   seson 2
9   season 3   seson 3
10 package 1 package 2
11 package 1 pakkage 6
12 package 6 package 2
13 package 6 pakkage 6
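To see why the two spellings of by behave differently, and where the 13 rows come from, here is a small base R sketch. It does not call fuzzyjoin at all. The names by_unnamed, by_named, a_x and c_z are made up for illustration, and utils::adist (plain Levenshtein distance) is only a stand-in for the distance the join computes. I am assuming the two agree on these particular strings, since no swapped adjacent letters are involved.

```r
# Unnamed vector: two separate keys, with no left/right pairing between them.
by_unnamed <- c("x", "z")
# Named vector: a single key pair, left column "x" against right column "z".
by_named <- c("x" = "z")

print(names(by_unnamed))  # NULL: nothing says which table each name belongs to
print(names(by_named))    # "x": column in the first data frame
print(unname(by_named))   # "z": column in the second data frame

# Key columns from the question, copied as plain character vectors.
a_x <- c("season", "season", "season", "package", "package")
c_z <- c("season", "seson", "seson", "package", "pakkage")

# Edit distance between every row of a (rows) and every row of c (columns).
d <- utils::adist(a_x, c_z)

# Row pairs that fall within max_dist = 2.
within <- d <= 2
print(within)
print(sum(within))  # 13, the number of rows in the output above
```

Each of the three "season" rows pairs with "season" and both "seson" rows (9 pairs), and each of the two "package" rows pairs with "package" and "pakkage" (4 pairs), which gives the 13 rows.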