R:Pivot_Wider/spread 通过获取按年排序的平均值
R: Pivot_Wider/spread by obtaining average sorted by year
我有以下数据集
Pet Shop
Year
Item
Price
A
2021
dog
300
A
2021
dog
250
A
2021
fish
20
A
2020
turtle
50
A
2020
dog
250
A
2020
cat
280
A
2019
rabbit
180
A
2019
cat
165
A
2019
cat
270
B
2021
dog
350
B
2021
fish
80
B
2021
fish
70
B
2020
cat
220
B
2020
turtle
90
B
2020
turtle
80
B
2020
fish
55
B
2019
fish
75
C
2021
dog
280
C
2020
cat
260
C
2020
cat
270
C
2019
fish
65
C
2019
cat
270
数据的代码如下
Pet_Shop = c(rep("A",9), rep("B",8), rep("C",5))
Item = c("Dog","Dog","Fish","Turtle","Dog","Cat","Rabbit","Cat","Cat","Dog","Fish","Fish","Cat","Turtle","Turtle","Fish","Fish","Dog","Cat","Cat","Fish","Cat")
Price = c(300,250,20,50,250,280,180,165,270,350,80,70,220,90,80,55,75,280,260,270,65,270)
Data = data.frame(Pet_Shop, Item, Price)
这里有谁知道我如何使用 pivot_wider 或传播(或任何其他方法)来实现以下 table?它按年份对商店进行分组,并取当年同一商店的类似商品的平均值。我有关于年份的问题。
Pet Shop
Year
dog
fish
turtle
cat
rabbit
A
2021
Average(300,250) = 275
20
NA
NA
NA
A
2020
250
NA
50
280
NA
A
2019
NA
NA
NA
217.5
NA
B
2021
350
75
NA
NA
NA
B
2020
NA
55
85
220
NA
B
2019
NA
75
NA
NA
NA
C
2021
280
NA
NA
NA
NA
C
2020
NA
NA
NA
265
NA
C
2019
NA
60
NA
270
NA
在 pivot_wider
中,您可以传递一个函数 (values_fn
) 以应用于 Pet_Shop
和 Year
的每个组合。
result <- tidyr::pivot_wider(Data, names_from = Item,
values_from = Price, values_fn = mean)
result
# Pet_Shop Year dog fish turtle cat rabbit
# <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 A 2021 275 20 NA NA NA
#2 A 2020 250 NA 50 280 NA
#3 A 2019 NA NA NA 218. 180
#4 B 2021 350 75 NA NA NA
#5 B 2020 NA 55 85 220 NA
#6 B 2019 NA 75 NA NA NA
#7 C 2021 280 NA NA NA NA
#8 C 2020 NA NA NA 265 NA
#9 C 2019 NA 65 NA 270 NA
同样可以用data.table
dcast
-
library(data.table)
dcast(setDT(Data), Pet_Shop + Year ~ Item,
value.var = "Price", fun.aggregate = mean)
我有以下数据集
Pet Shop | Year | Item | Price |
---|---|---|---|
A | 2021 | dog | 300 |
A | 2021 | dog | 250 |
A | 2021 | fish | 20 |
A | 2020 | turtle | 50 |
A | 2020 | dog | 250 |
A | 2020 | cat | 280 |
A | 2019 | rabbit | 180 |
A | 2019 | cat | 165 |
A | 2019 | cat | 270 |
B | 2021 | dog | 350 |
B | 2021 | fish | 80 |
B | 2021 | fish | 70 |
B | 2020 | cat | 220 |
B | 2020 | turtle | 90 |
B | 2020 | turtle | 80 |
B | 2020 | fish | 55 |
B | 2019 | fish | 75 |
C | 2021 | dog | 280 |
C | 2020 | cat | 260 |
C | 2020 | cat | 270 |
C | 2019 | fish | 65 |
C | 2019 | cat | 270 |
数据的代码如下
Pet_Shop = c(rep("A",9), rep("B",8), rep("C",5))
Item = c("Dog","Dog","Fish","Turtle","Dog","Cat","Rabbit","Cat","Cat","Dog","Fish","Fish","Cat","Turtle","Turtle","Fish","Fish","Dog","Cat","Cat","Fish","Cat")
Price = c(300,250,20,50,250,280,180,165,270,350,80,70,220,90,80,55,75,280,260,270,65,270)
Data = data.frame(Pet_Shop, Item, Price)
这里有谁知道我如何使用 pivot_wider 或传播(或任何其他方法)来实现以下 table?它按年份对商店进行分组,并取当年同一商店的类似商品的平均值。我有关于年份的问题。
Pet Shop | Year | dog | fish | turtle | cat | rabbit |
---|---|---|---|---|---|---|
A | 2021 | Average(300,250) = 275 | 20 | NA | NA | NA |
A | 2020 | 250 | NA | 50 | 280 | NA |
A | 2019 | NA | NA | NA | 217.5 | NA |
B | 2021 | 350 | 75 | NA | NA | NA |
B | 2020 | NA | 55 | 85 | 220 | NA |
B | 2019 | NA | 75 | NA | NA | NA |
C | 2021 | 280 | NA | NA | NA | NA |
C | 2020 | NA | NA | NA | 265 | NA |
C | 2019 | NA | 60 | NA | 270 | NA |
在 pivot_wider
中,您可以传递一个函数 (values_fn
) 以应用于 Pet_Shop
和 Year
的每个组合。
result <- tidyr::pivot_wider(Data, names_from = Item,
values_from = Price, values_fn = mean)
result
# Pet_Shop Year dog fish turtle cat rabbit
# <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 A 2021 275 20 NA NA NA
#2 A 2020 250 NA 50 280 NA
#3 A 2019 NA NA NA 218. 180
#4 B 2021 350 75 NA NA NA
#5 B 2020 NA 55 85 220 NA
#6 B 2019 NA 75 NA NA NA
#7 C 2021 280 NA NA NA NA
#8 C 2020 NA NA NA 265 NA
#9 C 2019 NA 65 NA 270 NA
同样可以用data.table
dcast
-
library(data.table)
dcast(setDT(Data), Pet_Shop + Year ~ Item,
value.var = "Price", fun.aggregate = mean)