How to merge two data frames in R on a shared variable, restricted to a date range

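How can I combine [table_a] and [table_b] as shown below? The desired result is [table_c]. The constraints are: table_a$category == table_b$category, and table_a$date falls within the period from start_date to end_date. Can anyone help? Thanks!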

library(dplyr)

table_a <- data.frame(date=c(44197,44259,44565,44354,44449,44385,44409),
           category=c("a","a","a","b","b","c","c")) %>% mutate(date=as.Date(date, origin='1899-12-30'))


table_b <- data.frame(category=c("a","a","a","b","b","c"),
                      start_date=c(43466,44288,44563,44197,44416,42736),
                      end_date=c(44287,44562,52963,44415,52963,52963),
                      seller=c("Allen","Grece","Lemon","Ally","Sam","Candy")) %>% 
  mutate(start_date=as.Date(start_date, origin='1899-12-30'),
         end_date=as.Date(end_date, origin='1899-12-30'))

A solution using dplyr:

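# Join every row of table_a to all table_b rows with the same category,
# then keep only the pairs whose date lies inside the seller's period.
table_c <- table_a %>% 
  left_join(., table_b, by="category") %>% 
  rowwise() %>%   # evaluate seq() one row at a time
  filter(., date %in% seq(start_date, end_date, 1)) %>%
  select(-start_date, -end_date) %>% 
  as.data.frame()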

       date category seller
1 2021-01-01        a  Allen
2 2021-03-04        a  Allen
3 2022-01-04        a  Lemon
4 2021-06-07        b   Ally
5 2021-09-10        b    Sam
6 2021-07-08        c  Candy
7 2021-08-01        c  Candy

You can use dplyr::left_join, then filter on the date range:

library(dplyr)
table_a %>%
  left_join(table_b, by='category') %>%
  filter(date >= start_date & date <= end_date) %>%
  select(date, category, start_date, seller)
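
Both answers above join on category first and filter afterwards, which builds every category pairing before discarding most of them. As an alternative sketch, not taken from either answer, and assuming dplyr 1.1.0 or later is installed, the range condition can be put into the join itself with join_by() and its between() helper. The data is rebuilt here so the snippet runs on its own:

```r
library(dplyr)  # assumption: dplyr >= 1.1.0, which introduced join_by()

# Same data as in the question; the numbers are Excel-style serial dates.
table_a <- data.frame(
  date = as.Date(c(44197, 44259, 44565, 44354, 44449, 44385, 44409),
                 origin = "1899-12-30"),
  category = c("a", "a", "a", "b", "b", "c", "c")
)

table_b <- data.frame(
  category = c("a", "a", "a", "b", "b", "c"),
  start_date = as.Date(c(43466, 44288, 44563, 44197, 44416, 42736),
                       origin = "1899-12-30"),
  end_date = as.Date(c(44287, 44562, 52963, 44415, 52963, 52963),
                     origin = "1899-12-30"),
  seller = c("Allen", "Grece", "Lemon", "Ally", "Sam", "Candy")
)

# Match on equal category AND start_date <= date <= end_date in one step.
table_c <- table_a %>%
  left_join(table_b,
            by = join_by(category, between(date, start_date, end_date))) %>%
  select(date, category, seller)

print(table_c)
```

One difference to be aware of: with the condition inside left_join, a table_a row that falls in no period is kept with seller set to NA, whereas the join-then-filter versions drop such rows. Swapping in inner_join should reproduce the dropping behaviour.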