How to do the same thing in R that we do using CROSS APPLY in SQL Server?

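(That is, evaluating a condition against every row of the outer data frame for each row of the inner data frame.) I have two data frames: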

ToTest

   Origin.State Dest.State  Ship.Date Cost
1            IL         NY 2015-03-25   10
2            IL         NY 2015-03-25   10
3            IL         NY 2015-03-24   10
4            IL         NY 2015-03-23   10
5            IL         NY 2015-03-18   10
6            PA         NY 2015-04-29   10
7            PA         NY 2015-04-29   10
8            PA         NY 2015-04-27   10
9            PA         NY 2015-04-24   10
10           PA         NY 2015-03-01   10
11           IL         TX 2015-05-18   10
12           IL         TX 2015-05-18   10
13           IL         TX 2015-05-14   10
14           IL         TX 2015-05-12   10
15           IL         TX 2015-05-13   10

TestShipmentGroup1

   Origin.State Dest.State  Ship.Date
1            IL         NY 2015-03-25
2            IL         NY 2015-03-24
3            IL         NY 2015-03-23
4            IL         NY 2015-03-18
5            PA         NY 2015-04-29
6            PA         NY 2015-04-27
7            PA         NY 2015-04-24
8            PA         NY 2015-03-01
9            IL         TX 2015-05-18
10           IL         TX 2015-05-14
11           IL         TX 2015-05-12
12           IL         TX 2015-05-13

I am trying to apply the conditions shown below to every row of the ToTest data frame, using one row of the TestShipmentGroup1 data frame at a time.

for (i in 1: nrow(TestShipmentGroup1))
{
TestShipmentGroup1%>%
  select(Origin.State,Dest.State,Ship.Date)
ToTest%>%
  select(Origin.State, Dest.State,Ship.Date,Cost) %>% 
  filter (((ToTest$Ship.Date >= (TestShipmentGroup1$Ship.Date-7)) 
           & (ToTest$Ship.Date < TestShipmentGroup1$Ship.Date))
          & (ToTest$Origin.State == TestShipmentGroup1$Origin.State)
          & (ToTest$Dest.State == TestShipmentGroup1$Dest.State))}

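Consider a cross join using merge() with no join variables (this returns the Cartesian product of the two sets, M x N rows), and then apply the filter conditions. Alternatively, an inner join on the state columns followed by the date filter also works. Either way, rename the columns first to avoid name collisions: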

library(dplyr)

...

names(ToTest) <- paste0(names(ToTest), "1")
names(TestShipmentGroup1) <- paste0(names(TestShipmentGroup1), "2")

# CROSS JOIN WITH FILTER
# by = NULL makes merge() return the Cartesian product; the Ship.Date columns must be of class Date
finaldf <- merge(select(ToTest, Origin.State1, Dest.State1, Ship.Date1),
                 select(TestShipmentGroup1, Origin.State2, Dest.State2, Ship.Date2),
                 by = NULL) %>%
  filter(Ship.Date1 >= Ship.Date2 - as.difftime(7, units = "days"),
         Ship.Date1 < Ship.Date2,
         Origin.State1 == Origin.State2,
         Dest.State1 == Dest.State2)

# INNER JOIN WITH FILTER
# The key columns keep their ToTest names (Origin.State1, Dest.State1) in the result
finaldf <- inner_join(select(ToTest, Origin.State1, Dest.State1, Ship.Date1),
                      select(TestShipmentGroup1, Origin.State2, Dest.State2, Ship.Date2),
                      by = c("Origin.State1" = "Origin.State2", "Dest.State1" = "Dest.State2")) %>%
  filter(Ship.Date1 >= Ship.Date2 - as.difftime(7, units = "days"),
         Ship.Date1 < Ship.Date2)
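
For anyone who wants to try this end to end, here is a minimal, self-contained sketch in base R only (no dplyr). It rebuilds the sample data from the question and compares a row-at-a-time version, which is closest in spirit to CROSS APPLY, with the join-then-filter version. Several things here are assumptions and not part of the original post: Ship.Date is stored as class Date, TestShipmentGroup1 is rebuilt with unique() (which reproduces the 12 rows shown in the question), and the names match_one, Group.Ship.Date, per_row, groups, joined and set_based are made up for illustration.

```r
# Sample data copied from the question; Ship.Date is assumed to be of class Date.
ToTest <- data.frame(
  Origin.State = rep(c("IL", "PA", "IL"), each = 5),
  Dest.State   = rep(c("NY", "NY", "TX"), each = 5),
  Ship.Date    = as.Date(c(
    "2015-03-25", "2015-03-25", "2015-03-24", "2015-03-23", "2015-03-18",
    "2015-04-29", "2015-04-29", "2015-04-27", "2015-04-24", "2015-03-01",
    "2015-05-18", "2015-05-18", "2015-05-14", "2015-05-12", "2015-05-13")),
  Cost         = 10,
  stringsAsFactors = FALSE
)

# Assumption: the distinct shipments of ToTest reproduce TestShipmentGroup1.
TestShipmentGroup1 <- unique(ToTest[, c("Origin.State", "Dest.State", "Ship.Date")])

# Hypothetical helper: for outer row i, return the ToTest rows with the same
# origin and destination that shipped in the 7 days before (and excluding)
# that row's ship date. Returns a zero-row data frame when nothing matches.
match_one <- function(i) {
  g <- TestShipmentGroup1[i, ]
  hit <- ToTest[ToTest$Origin.State == g$Origin.State &
                ToTest$Dest.State   == g$Dest.State &
                ToTest$Ship.Date    >= g$Ship.Date - 7 &
                ToTest$Ship.Date    <  g$Ship.Date, , drop = FALSE]
  data.frame(Group.Ship.Date = rep(g$Ship.Date, nrow(hit)), hit)
}

# Row-at-a-time version: apply the helper to every outer row and stack the results.
per_row <- do.call(rbind, lapply(seq_len(nrow(TestShipmentGroup1)), match_one))
rownames(per_row) <- NULL

# Set-based version: join on the state columns, then filter on the date window.
groups <- TestShipmentGroup1
names(groups)[names(groups) == "Ship.Date"] <- "Group.Ship.Date"
joined <- merge(groups, ToTest, by = c("Origin.State", "Dest.State"))
set_based <- joined[joined$Ship.Date >= joined$Group.Ship.Date - 7 &
                    joined$Ship.Date <  joined$Group.Ship.Date, ]

nrow(per_row)    # 15 rows for the sample data
nrow(set_based)  # 15 rows for the sample data
```

Both versions select the same 15 matches on this sample. The row-at-a-time form is easier to read if you are used to CROSS APPLY, while the join-then-filter form avoids an explicit loop and will usually scale better, at the cost of materialising the intermediate join (60 rows here).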