inner_join outputs an empty dataframe
I have two data frames that I want to merge.
df1
PlayerName Date playerid Opp Team HomeAway IP H R ER HR BB IBB WP BK SO exLI
1: Sandy Alcantara 2022-04-26 18684 @WSN MIA A 6.0 6 1 1 0 3 0 0 0 5 1.0304
2: Sandy Alcantara 2022-04-20 18684 STL MIA H 8.0 4 0 0 0 1 0 0 0 6 2.2004
3: Sandy Alcantara 2022-04-14 18684 PHI MIA H 6.1 7 2 2 0 1 0 0 0 5 1.1064
4: Sandy Alcantara 2022-04-08 18684 @SFG MIA A 5.0 3 3 2 1 5 0 0 0 4 0.2488
df2
Date PlayerName GIDP
1 2022-04-14 Alcantara, Sandy 1
2 2022-04-20 Alcantara, Sandy 1
3 2022-04-26 Alcantara, Sandy 2
When I use an inner join, the output is an empty data frame.
game_logs <- inner_join(df1, df2, by = c("PlayerName", "Date"))
[1] Date PlayerName GIDP playerid Opp Team HomeAway IP H R ER HR BB IBB WP BK SO exLI
<0 rows> (or 0-length row.names)
What I want to end up with is something that looks like df3:
df3
PlayerName Date playerid Opp Team HomeAway IP H R ER HR BB IBB WP BK SO exLI GIDP
1: Sandy Alcantara 2022-04-26 18684 @WSN MIA A 6.0 6 1 1 0 3 0 0 0 5 1.0304 2
2: Sandy Alcantara 2022-04-20 18684 STL MIA H 8.0 4 0 0 0 1 0 0 0 6 2.2004 1
3: Sandy Alcantara 2022-04-14 18684 PHI MIA H 6.1 7 2 2 0 1 0 0 0 5 1.1064 1
4: Sandy Alcantara 2022-04-08 18684 @SFG MIA A 5.0 3 3 2 1 5 0 0 0 4 0.2488 NA
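The join comes back empty because the PlayerName values never match: df1 stores "Sandy Alcantara" while df2 stores "Alcantara, Sandy". You need to rearrange the names in df2 first:
library(dplyr)
# Rewrite "Last, First" as "First Last" so the keys match df1
df1 %>%
  inner_join(mutate(df2, PlayerName = sub('(\\w+), (\\w+)', "\\2 \\1", PlayerName)),
             by = c("PlayerName", "Date"))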
PlayerName Date playerid Opp Team HomeAway IP H R ER HR BB IBB WP BK SO exLI GIDP
1 Sandy Alcantara 2022-04-26 18684 @WSN MIA A 6.0 6 1 1 0 3 0 0 0 5 1.0304 2
2 Sandy Alcantara 2022-04-20 18684 STL MIA H 8.0 4 0 0 0 1 0 0 0 6 2.2004 1
3 Sandy Alcantara 2022-04-14 18684 PHI MIA H 6.1 7 2 2 0 1 0 0 0 5 1.1064 1
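Note that the inner join drops the 2022-04-08 game, because df2 has no row for it. If the goal is the four-row result from the question, with NA for GIDP on that date, a left join is probably closer to what is wanted. The following is a minimal sketch, not a definitive solution: the data frames are cut-down stand-ins for the ones in the question (only a few columns), Date is assumed to be stored as character in both, and df2_fixed is a name made up for illustration.

```r
library(dplyr)

# Cut-down stand-ins for the question's data (assumption: Date is character in both)
df1 <- data.frame(
  PlayerName = rep("Sandy Alcantara", 4),
  Date = c("2022-04-26", "2022-04-20", "2022-04-14", "2022-04-08"),
  IP = c(6.0, 8.0, 6.1, 5.0)
)
df2 <- data.frame(
  Date = c("2022-04-14", "2022-04-20", "2022-04-26"),
  PlayerName = "Alcantara, Sandy",
  GIDP = c(1, 1, 2)
)

# Rewrite "Last, First" as "First Last" so the keys match df1
df2_fixed <- mutate(df2, PlayerName = sub("(\\w+), (\\w+)", "\\2 \\1", PlayerName))

# left_join keeps every row of df1; games missing from df2 get NA for GIDP
game_logs <- left_join(df1, df2_fixed, by = c("PlayerName", "Date"))
game_logs
```

The pattern only handles single-word first and last names. Names with spaces, suffixes, or punctuation would pass through unchanged and still fail to match, so it may be worth checking for leftovers with anti_join(df2_fixed, df1, by = c("PlayerName", "Date")).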