Match data.table/data.frame with a matrix that partially matches
I am trying to merge the following data.table:
DE <- structure(list(date1 = c("2000", "2001", "2003"), country = c("DE",
"DE", "DE"), value = c(10, 20, 30)), row.names = c(NA, -3L), class = c("data.table",
"data.frame"))
date1 country value
1: 2000 DE 10
2: 2001 DE 20
3: 2003 DE 30
I would like to merge it with a matrix of zeros:
df <- structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), .Dim = 6:5, .Dimnames = list(
c("2000", "2001", "2002", "2003", "2004", "2005"), c("UK",
"DE", "FR", "SP", "IT")))
UK DE FR SP IT
2000 0 0 0 0 0
2001 0 0 0 0 0
2002 0 0 0 0 0
2003 0 0 0 0 0
2004 0 0 0 0 0
2005 0 0 0 0 0
so that the desired output is as follows:
UK DE FR SP IT
2000 0 10 0 0 0
2001 0 20 0 0 0
2002 0 0 0 0 0
2003 0 30 0 0 0
2004 0 0 0 0 0
2005 0 0 0 0 0
We can use row/column name indexing to assign the 'value' column of 'DE' into 'df':
df[DE$date1, DE$country] <- DE$value
-output
> df
UK DE FR SP IT
2000 0 10 0 0 0
2001 0 20 0 0 0
2002 0 0 0 0 0
2003 0 30 0 0 0
2004 0 0 0 0 0
2005 0 0 0 0 0
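One caveat worth illustrating: df[rows, cols] <- value addresses every combination of the given rows and columns, so the one-liner above works here because all rows of DE share the same country. A minimal sketch of a cell-by-cell alternative follows, using a two-column character matrix as the index. The names m and long, and the extra "FR" row, are hypothetical and not part of the original question.

```r
# Zero matrix with years as row names and countries as column names,
# mirroring the matrix in the question (rebuilt here so the sketch is self-contained)
years <- as.character(2000:2005)
countries <- c("UK", "DE", "FR", "SP", "IT")
m <- matrix(0, nrow = length(years), ncol = length(countries),
            dimnames = list(years, countries))

# Hypothetical long-format data with TWO countries (the "FR" row is an assumption)
long <- data.frame(
  date1   = c("2000", "2001", "2003", "2002"),
  country = c("DE", "DE", "DE", "FR"),
  value   = c(10, 20, 30, 5),
  stringsAsFactors = FALSE
)

# Each row of the character index matrix selects exactly one (row, column) cell,
# matched against the dimnames of m
idx <- cbind(long$date1, long$country)
m[idx] <- long$value
m
```

With a single country the result is the same as the one-liner; the matrix index only matters once several countries appear in the long table.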
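This stands in sharp contrast to master akrun's solution, and it is clearly no match for it. For learning purposes, here is my idea:
- df is of class matrix, array, so convert it to a data.frame rather than a tibble, because tibbles do not accept row names.
- Apply pivot_wider to DE and add a right_join.
- Then do some tweaking with mutate(DE = coalesce(DE.x,DE.y), .keep="unused", .before=4), which I really like.
- Bring back the rownames.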
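library(dplyr)
library(tidyr)
library(tibble)  # provides rownames_to_column() and column_to_rownames()

# Convert the matrix to a data.frame and move the row names into a "date1" column
df <- df %>%
  as.data.frame() %>%
  rownames_to_column("date1")

# Spread DE to wide format, join onto the full grid, then merge the two DE columns
DE %>%
  pivot_wider(
    names_from = country,
    values_from = value
  ) %>%
  right_join(df, by="date1") %>%
  arrange(date1) %>%
  mutate(DE = coalesce(DE.x,DE.y), .keep="unused", .before=4) %>%
  column_to_rownames("date1")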
UK DE FR SP IT
2000 0 10 0 0 0
2001 0 20 0 0 0
2002 0 0 0 0 0
2003 0 30 0 0 0
2004 0 0 0 0 0
2005 0 0 0 0 0
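To make the coalesce step more concrete: after the join there are two DE columns, DE.x from the pivoted table (NA for years without a match) and DE.y from the zero matrix. A small sketch of what coalesce does with them, using plain vectors as stand-ins for those columns in year order (the variable filled is only illustrative):

```r
library(dplyr)

# Stand-ins for the two columns produced by the join, in year order 2000..2005
DE.x <- c(10, 20, NA, 30, NA, NA)  # from the pivoted DE table; NA where no year matched
DE.y <- c(0, 0, 0, 0, 0, 0)        # from the zero matrix

# coalesce() takes, position by position, the first value that is not NA
filled <- coalesce(DE.x, DE.y)
filled
# [1] 10 20  0 30  0  0
```

This is why the unmatched years end up as 0 rather than NA in the final DE column.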