如何使用 dplyr 进行分组和内部连接
How to do grouping and inner join using dplyr
我有以下数据框:
df<-data.frame(date=c("2019-01-06","2019-01-13","2019-01-20","2019-01-06","2019-01-13","2019-01-20","2019-01-06","2019-01-13","2019-01-20","2019-01-27"),
version=c("A","A","A","B","B","B","C","C","C","C"),
quantity=c(12,2,5,10,2,12,4,5,6,8),
price=c(2.2,2.3,2.2,2.4,2.3,2.6,2.2,2.5,2.4,2.8))
df
date version quantity price
1 2019-01-06 A 12 2.2
2 2019-01-13 A 2 2.3
3 2019-01-20 A 5 2.2
4 2019-01-06 B 10 2.4
5 2019-01-13 B 2 2.3
6 2019-01-20 B 12 2.6
7 2019-01-06 C 4 2.2
8 2019-01-13 C 5 2.5
9 2019-01-20 C 6 2.4
10 2019-01-27 C 8 2.8
我想按 version
列进行分组并在 date
列内部加入组(跨组通用))以获得以下过滤输出:
date quantity_A price_A quantity_B price_B quantity_C price_C
1 2019-01-06 12 2.2 10 2.4 4 2.2
2 2019-01-13 2 2.3 2 2.3 5 2.5
3 2019-01-20 5 2.2 12 2.6 6 2.4
谢谢。
您可以重塑数据并删除 NA
值,使其表现得像内部联接。
tidyr::pivot_wider(df,names_from = version, values_from = c(quantity, price)) %>%
na.omit()
# date quantity_A quantity_B quantity_C price_A price_B price_C
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 2019-01-06 12 10 4 2.2 2.4 2.2
#2 2019-01-13 2 2 5 2.3 2.3 2.5
#3 2019-01-20 5 12 6 2.2 2.6 2.4
我有以下数据框:
df<-data.frame(date=c("2019-01-06","2019-01-13","2019-01-20","2019-01-06","2019-01-13","2019-01-20","2019-01-06","2019-01-13","2019-01-20","2019-01-27"),
version=c("A","A","A","B","B","B","C","C","C","C"),
quantity=c(12,2,5,10,2,12,4,5,6,8),
price=c(2.2,2.3,2.2,2.4,2.3,2.6,2.2,2.5,2.4,2.8))
df
date version quantity price
1 2019-01-06 A 12 2.2
2 2019-01-13 A 2 2.3
3 2019-01-20 A 5 2.2
4 2019-01-06 B 10 2.4
5 2019-01-13 B 2 2.3
6 2019-01-20 B 12 2.6
7 2019-01-06 C 4 2.2
8 2019-01-13 C 5 2.5
9 2019-01-20 C 6 2.4
10 2019-01-27 C 8 2.8
我想按 version
列进行分组并在 date
列内部加入组(跨组通用))以获得以下过滤输出:
date quantity_A price_A quantity_B price_B quantity_C price_C
1 2019-01-06 12 2.2 10 2.4 4 2.2
2 2019-01-13 2 2.3 2 2.3 5 2.5
3 2019-01-20 5 2.2 12 2.6 6 2.4
谢谢。
您可以重塑数据并删除 NA
值,使其表现得像内部联接。
tidyr::pivot_wider(df,names_from = version, values_from = c(quantity, price)) %>%
na.omit()
# date quantity_A quantity_B quantity_C price_A price_B price_C
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 2019-01-06 12 10 4 2.2 2.4 2.2
#2 2019-01-13 2 2 5 2.3 2.3 2.5
#3 2019-01-20 5 12 6 2.2 2.6 2.4