How to do the same thing in R that we do using CROSS APPLY in SQL Server?
(That is, run a condition for each row of an outer data frame against each row of an inner data frame.) I have two data frames:
ToTest
Origin.State Dest.State Ship.Date Cost
1 IL NY 2015-03-25 10
2 IL NY 2015-03-25 10
3 IL NY 2015-03-24 10
4 IL NY 2015-03-23 10
5 IL NY 2015-03-18 10
6 PA NY 2015-04-29 10
7 PA NY 2015-04-29 10
8 PA NY 2015-04-27 10
9 PA NY 2015-04-24 10
10 PA NY 2015-03-01 10
11 IL TX 2015-05-18 10
12 IL TX 2015-05-18 10
13 IL TX 2015-05-14 10
14 IL TX 2015-05-12 10
15 IL TX 2015-05-13 10
TestShipmentGroup1
Origin.State Dest.State Ship.Date
1 IL NY 2015-03-25
2 IL NY 2015-03-24
3 IL NY 2015-03-23
4 IL NY 2015-03-18
5 PA NY 2015-04-29
6 PA NY 2015-04-27
7 PA NY 2015-04-24
8 PA NY 2015-03-01
9 IL TX 2015-05-18
10 IL TX 2015-05-14
11 IL TX 2015-05-12
12 IL TX 2015-05-13
I am trying to apply the condition shown below to every row of the ToTest data frame, using one row of the TestShipmentGroup1 data frame at a time.
for (i in 1: nrow(TestShipmentGroup1))
{
TestShipmentGroup1%>%
select(Origin.State,Dest.State,Ship.Date)
ToTest%>%
select(Origin.State, Dest.State,Ship.Date,Cost) %>%
filter (((ToTest$Ship.Date >= (TestShipmentGroup1$Ship.Date-7))
& (ToTest$Ship.Date < TestShipmentGroup1$Ship.Date))
& (ToTest$Origin.State == TestShipmentGroup1$Origin.State)
& (ToTest$Dest.State == TestShipmentGroup1$Dest.State))}
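Consider using merge().
A cross join with no join variables returns the Cartesian product (M x N rows) of the two sets; apply the filter conditions to that result. Alternatively, an inner join on the state columns followed by the date filter also works. Either way, first rename the columns to avoid name collisions: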
library(dplyr)
...
names(ToTest) <- paste0(names(ToTest), "1")
names(TestShipmentGroup1) <- paste0(names(TestShipmentGroup1), "2")
# CROSS JOIN WITH FILTER
# by = NULL makes merge() return the Cartesian product.
# Assumes the Ship.Date columns are of class Date.
finaldf <- merge(select(ToTest, Origin.State1, Dest.State1, Ship.Date1),
                 select(TestShipmentGroup1, Origin.State2, Dest.State2, Ship.Date2),
                 by = NULL) %>%
  filter((Ship.Date1 >= (Ship.Date2 - as.difftime(7, units = "days")))
         & (Ship.Date1 < Ship.Date2)
         & (Origin.State1 == Origin.State2)
         & (Dest.State1 == Dest.State2))
# INNER JOIN WITH FILTER
finaldf <- inner_join(select(ToTest, Origin.State1, Dest.State1, Ship.Date1),
select(TestShipmentGroup1, Origin.State2, Dest.State2, Ship.Date2),
by = c("Origin.State1"="Origin.State2", "Dest.State1"="Dest.State2")) %>%
  filter((Ship.Date1 >= (Ship.Date2 - as.difftime(7, units = "days")))
         & (Ship.Date1 < Ship.Date2))
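To check the cross join approach end to end, here is a minimal self-contained sketch in base R. It uses a handful of rows copied from the question (not the full tables), and it assumes the Ship.Date columns are stored as Date, which the question does not confirm. The names crossed and the use of subset() instead of dplyr's filter() are my own choices for illustration.

```r
# Sample rows taken from the question (subset only).
# Assumption: Ship.Date is of class Date, so date arithmetic works.
ToTest <- data.frame(
  Origin.State = c("IL", "IL", "IL", "IL", "IL", "PA", "PA"),
  Dest.State   = "NY",
  Ship.Date    = as.Date(c("2015-03-25", "2015-03-25", "2015-03-24",
                           "2015-03-23", "2015-03-18", "2015-04-29",
                           "2015-04-27")),
  Cost         = 10,
  stringsAsFactors = FALSE
)
TestShipmentGroup1 <- data.frame(
  Origin.State = c("IL", "IL", "IL", "IL", "PA"),
  Dest.State   = "NY",
  Ship.Date    = as.Date(c("2015-03-25", "2015-03-24", "2015-03-23",
                           "2015-03-18", "2015-04-29")),
  stringsAsFactors = FALSE
)

# Suffix the column names so the two tables share no names
names(ToTest) <- paste0(names(ToTest), "1")
names(TestShipmentGroup1) <- paste0(names(TestShipmentGroup1), "2")

# by = NULL makes merge() return the Cartesian product (7 x 5 rows)
crossed <- merge(ToTest, TestShipmentGroup1, by = NULL)

# Keep pairs with matching states where the ToTest shipment falls
# in the 7 days before (and strictly earlier than) the group date
finaldf <- subset(crossed,
                  Origin.State1 == Origin.State2 &
                    Dest.State1 == Dest.State2 &
                    Ship.Date1 >= Ship.Date2 - 7 &
                    Ship.Date1 < Ship.Date2)

# Number of matching ToTest rows per group shipment date
table(finaldf$Ship.Date2)
```

Keep in mind that the cross join materializes every pair of rows before filtering, so on large tables the inner join on the state columns is likely to be the cheaper option.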