Reshaping a table with dplyr in R

I would welcome some advice on the correct use of dplyr in R. We have the following data:

   City            Amount    Category
1  Los Angeles     100       Film
2  Los Angeles     200       Film
3  Los Angeles     400       Music 
4  Seattle         300       Coffee
5  Boston          600       Books
...

The final result should look like this:

                        Film   Coffee   Books   ...
City  
Los Angeles, CA         Sum    Sum      Sum     Sum 
Seattle, WA             Sum    Sum      Sum     Sum 
Boston, MA              Sum    Sum      Sum     Sum  

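I want a pivot table that sums the "Amount" for each category within each city, with the cities in a single column on the left and all the categories across the top as a row.

What I tried: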

data %>%
  group_by(City, Category) %>%
  summarise(Amount = sum(Amount))

which gives something more like:

   City            Amount    Category
1  Los Angeles     300       Film
3  Los Angeles     400       Music 
4  Seattle         300       Coffee
5  Boston          600       Books

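The calculation is correct, but as mentioned above, we need the cities and categories laid out as a matrix, with the summed amount in each cell.

Thanks for your help!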

What you are looking for is tidyr::spread, which reshapes your data.frame from long format to wide format:

library(tidyverse)

# recreate the data
data <- tribble(
  ~City,             ~Amount,   ~Category,
  "Los Angeles",     100,       "Film",
  "Los Angeles",     200,       "Film",
  "Los Angeles",     400,       "Music", 
  "Seattle",         300,       "Coffee",
  "Boston",          600,       "Books"
)

# using your code to get the data in the long-format
data_long <- data %>% 
  group_by(City, Category) %>%
  summarise(Amount = sum(Amount))

data_long
#> # A tibble: 4 x 3
#> # Groups:   City [?]
#>          City Category Amount
#>         <chr>    <chr>  <dbl>
#> 1      Boston    Books    600
#> 2 Los Angeles     Film    300
#> 3 Los Angeles    Music    400
#> 4     Seattle   Coffee    300

# spread to wide using the tidyr-package (in tidyverse)
data_wide <- spread(data_long, key = "Category", value = "Amount", fill = 0)

data_wide
#> # A tibble: 3 x 5
#> # Groups:   City [3]
#>          City Books Coffee  Film Music
#> *       <chr> <dbl>  <dbl> <dbl> <dbl>
#> 1      Boston   600      0     0     0
#> 2 Los Angeles     0      0   300   400
#> 3     Seattle     0    300     0     0
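
In newer tidyr releases, spread() is superseded by pivot_wider(). As a rough sketch of the same reshape, assuming tidyr 1.0.0 or later is installed (the data is rebuilt here with base data.frame only so the snippet stands alone), something like this should work:

```r
library(dplyr)
library(tidyr)

# recreate the example data from the question
data <- data.frame(
  City     = c("Los Angeles", "Los Angeles", "Los Angeles", "Seattle", "Boston"),
  Amount   = c(100, 200, 400, 300, 600),
  Category = c("Film", "Film", "Music", "Coffee", "Books"),
  stringsAsFactors = FALSE
)

# sum Amount per City/Category, then drop the grouping
data_long <- data %>%
  group_by(City, Category) %>%
  summarise(Amount = sum(Amount)) %>%
  ungroup()

# long -> wide; City/Category pairs with no rows are filled with 0
data_wide <- pivot_wider(
  data_long,
  names_from  = Category,
  values_from = Amount,
  values_fill = list(Amount = 0)
)

data_wide
```

Unlike spread(), pivot_wider() orders the new columns by first appearance rather than alphabetically, so the column order may differ from the output above.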

Converting to a matrix:

mat <- as.matrix(data_wide %>% ungroup %>% select(-City))
rownames(mat) <- data_wide$City

mat
#>             Books Coffee Film Music
#> Boston        600      0    0     0
#> Los Angeles     0      0  300   400
#> Seattle         0    300    0     0

str(mat)
#>  num [1:3, 1:4] 600 0 0 0 0 300 0 300 0 0 ...
#>  - attr(*, "dimnames")=List of 2
#>   ..$ : chr [1:3] "Boston" "Los Angeles" "Seattle"
#>   ..$ : chr [1:4] "Books" "Coffee" "Film" "Music"
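
If a matrix is all that is needed, base R's tapply() can go from the raw data to the matrix in one step, without the intermediate long and wide tables. This is only a sketch of an alternative and not part of the tidyr approach above; the name mat_base is made up for illustration:

```r
# recreate the example data from the question
data <- data.frame(
  City     = c("Los Angeles", "Los Angeles", "Los Angeles", "Seattle", "Boston"),
  Amount   = c(100, 200, 400, 300, 600),
  Category = c("Film", "Film", "Music", "Coffee", "Books"),
  stringsAsFactors = FALSE
)

# sum Amount for every City x Category combination;
# combinations with no rows come back as NA
mat_base <- tapply(data$Amount, list(data$City, data$Category), sum)

# replace the NA cells with 0 to match the fill = 0 used above
mat_base[is.na(mat_base)] <- 0

mat_base
```

The row and column names come from the sorted unique values of City and Category, so the layout should match the matrix built from data_wide.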