Formatting grouped data for tables in R

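I'm trying to display my data as a table, but I can't work out how to rearrange it into the right shape. I'm used to tidying data for plotting, but I get a bit lost when preparing tables. This seems like something very basic, but I haven't been able to find an explanation of what I'm doing wrong.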

I have 3 columns of data: Type, Year, and n. As currently formatted, the data produces a table like this:

Type    Year    n
Type C  1   5596
Type D  1   1119
Type E  1   116
Type A  1   402
Type F  1   1614
Type B  1   105
Type C  2   26339
Type D  2   14130
Type E  2   98
Type A  2   3176
Type F  2   3071
Type B  2   88

What I want is Type as the row names, Year as the column names, and n filling the table cells, like this:

         1      2        
Type A   402    3176   
Type B   105    88
Type C   5596   26339
Type D   1119   14130
Type E   116    98
Type F   1614   3071

The mistake may have been made upstream. Starting from the full raw dataset, I got this output by doing the following:

exampletable <- df %>%
  group_by(Year) %>%
  count(Type) %>%
  select(Type, Year, n)

Here is the dput() output:

structure(list(Type = c("Type C", "Type D", "Type E", "Type A", 
"Type F", "Type B", "Type C", "Type D", "Type E", "Type A", "Type F", 
"Type B", "Type C", "Type D", "Type E", "Type A", "Type F", "Type B", 
"Type C", "Type D", "Type E", "Type A", "Type F", "Type B", "Type C", 
"Type D", "Type E"), Year = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 
2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5), n = c(5596, 
1119, 116, 402, 1614, 105, 26339, 14130, 98, 3176, 3071, 88, 
40958, 17578, 104, 3904, 3170, 102, 33145, 23800, 93, 1264, 7084, 
1262, 34642, 24911, 504)), class = c("spec_tbl_df", "tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -27L), spec = structure(list(
    cols = list(Type = structure(list(), class = c("collector_character", 
    "collector")), Year = structure(list(), class = c("collector_double", 
    "collector")), n = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1), class = "col_spec"))

You can reshape the data to wide format and turn the Type column into row names.

library(dplyr)  # provides %>%
# Pivot Year into columns, then move Type into the row names
tidyr::pivot_wider(df, names_from = Year, values_from = n) %>%
   tibble::column_to_rownames('Type')

#          1     2     3     4     5
#Type C 5596 26339 40958 33145 34642
#Type D 1119 14130 17578 23800 24911
#Type E  116    98   104    93   504
#Type A  402  3176  3904  1264    NA
#Type F 1614  3071  3170  7084    NA
#Type B  105    88   102  1262    NA
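
The NA entries appear because Type A, Type F, and Type B have no Year 5 rows in the dput() data. If zeros are an acceptable stand-in for those missing counts, and the rows should come out alphabetically as in the desired table, something along these lines may help. This is only a minimal sketch: the small `df` is an illustrative subset of the posted data, and filling with 0 is an assumption about what the missing combinations mean.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Illustrative subset of the posted data; Type A has no Year 5 row
df <- tibble(
  Type = c("Type C", "Type D", "Type A", "Type C", "Type D"),
  Year = c(4, 4, 4, 5, 5),
  n    = c(33145, 23800, 1264, 34642, 24911)
)

wide <- df %>%
  arrange(Type) %>%                  # sort rows alphabetically by Type
  pivot_wider(names_from = Year,
              values_from = n,
              values_fill = 0) %>%   # assumption: a missing count means 0
  column_to_rownames("Type")

wide
```

Leaving out `values_fill` keeps the NA values, which may be preferable when a missing combination means "not measured" rather than "none observed".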

You can use the tidyr package to get the wider format and the tibble package to convert a column to row names.

dataset <- read.csv(file_location)
dataset <- tidyr::pivot_wider(dataset, names_from = Year, values_from = n)

tibble::column_to_rownames(dataset, var = 'Type')

#          1     2
#Type C 5596 26339
#Type D 1119 14130
#Type E  116    98
#Type A  402  3176
#Type F 1614  3071
#Type B  105    88
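
If adding packages is not desirable, base R's `xtabs()` can build a similar Type-by-Year table. This is a sketch of an alternative, not what either answer above proposes, and the small `df` is an illustrative subset of the posted data. Note that `xtabs()` sums n within each cell, orders rows and columns by sorted level, and puts 0 (not NA) where a combination is absent.

```r
# Illustrative subset of the posted data
df <- data.frame(
  Type = c("Type C", "Type D", "Type A", "Type C", "Type D", "Type A"),
  Year = c(1, 1, 1, 2, 2, 2),
  n    = c(5596, 1119, 402, 26339, 14130, 3176)
)

# Sum n for each Type/Year combination: rows are Type, columns are Year
tab <- xtabs(n ~ Type + Year, data = df)

tab
```

The result is a contingency table rather than a data frame; `as.data.frame.matrix(tab)` converts it to a data frame with Type as the row names if one is needed downstream.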