Splitting the date alongside pivot_wider

I have the following table.

Date Cat
15/2/1999 A
15/2/1999 A
15/2/1999 B
15/5/1999 A
15/5/1999 B
15/10/1999 C
15/10/1999 C
15/2/2001 A
15/2/2001 A
15/6/2001 B
15/6/2001 B
15/6/2001 C
15/11/2001 C
15/11/2001 C

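I would like to apply pivot_wider (or any similar function) to it so that the result has separate Month and Year columns, as shown below. The Cat column is split into its values A, B and C, each showing a count, plus a row total.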

Month Year A B C Total
February 1999 2 1 0 3
May 1999 1 1 0 2
October 1999 0 0 2 2
February 2001 2 0 0 2
June 2001 0 2 1 3
November 2001 0 0 2 2

Does anyone here know how I can do both at the same time? Thanks.

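You can do this with the tidyverse packages. First convert the Date column to the Date class, then count the rows per month and category, pivot wider, and format the table.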

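library(tidyverse)
library(lubridate)  # floor_date(), year(), month(); older tidyverse versions do not attach it

data %>% 
  # parse the day/month/year strings into Date values
  mutate(Date = as.Date(Date, format = "%d/%m/%Y")) %>% 
  # count rows per category and calendar month
  group_by(Cat, month = floor_date(Date, "month")) %>% 
  count() %>% 
  # drop the grouping so that month can be overwritten below
  ungroup() %>% 
  pivot_wider(names_from = Cat, values_from = n, values_fill = 0) %>% 
  mutate(year = year(month), .before = "A",
         month = month(month, label = TRUE, abbr = FALSE)) %>% 
  mutate(Total = rowSums(across(A:C))) %>% 
  arrange(year, month)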

  month     year     A     B     C Total
  <ord>    <dbl> <int> <int> <int> <dbl>
1 February  1999     2     1     0     3
2 May       1999     1     1     0     2
3 October   1999     0     0     2     2
4 February  2001     2     0     0     2
5 June      2001     0     2     1     3
6 November  2001     0     0     2     2
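The date handling in the pipeline above can be checked in isolation. The following is a minimal sketch, not part of the original answer: the vector x holds a few illustrative strings in the same day/month/year format as the question, and the full month names returned by month() depend on the session's locale.

```r
library(lubridate)

# Illustrative values only, in the same format as the Date column
x <- c("15/2/1999", "15/10/1999", "15/6/2001")

d <- as.Date(x, format = "%d/%m/%Y")  # parse as day/month/year
m <- floor_date(d, "month")           # snap each date to the first of its month

format(m)                             # ISO dates, all on day 01
year(m)                               # numeric year
month(m)                              # numeric month
month(m, label = TRUE, abbr = FALSE)  # ordered factor of month names (locale-dependent)
```

Because floor_date() maps every date in a month to the same value, grouping on it gives one group per calendar month, and year() and month() can then be derived from that single column.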

Data

data <- structure(list(Date = c("15/2/1999", "15/2/1999", "15/2/1999", 
"15/5/1999", "15/5/1999", "15/10/1999", "15/10/1999", "15/2/2001", 
"15/2/2001", "15/6/2001", "15/6/2001", "15/6/2001", "15/11/2001", 
"15/11/2001"), Cat = c("A", "A", "B", "A", "B", "C", "C", "A", 
"A", "B", "B", "C", "C", "C")), class = "data.frame", row.names = c(NA, 
-14L))

Another possible solution:

library(tidyverse)
library(lubridate)

df <- data.frame(
  stringsAsFactors = FALSE,
  Date = c("15/2/1999",
           "15/2/1999","15/2/1999","15/5/1999","15/5/1999",
           "15/10/1999","15/10/1999","15/2/2001","15/2/2001",
           "15/6/2001","15/6/2001","15/6/2001","15/11/2001",
           "15/11/2001"),
  Cat = c("A","A","B","A",
          "B","C","C","A","A","B","B","C","C","C")
)

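df %>% 
  # parse the strings once with dmy(), then derive Month and Year from the parsed dates
  mutate(Date = dmy(Date),
         Month = month(Date, label = TRUE), Year = year(Date)) %>% 
  # values_fn = length counts the rows in each Month/Year/Cat cell
  pivot_wider(id_cols = c(Month, Year), names_from = Cat,
       values_from = Cat, values_fn = length, values_fill = 0) %>% 
  mutate(Total = rowSums(across(A:C)))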

#> # A tibble: 6 × 6
#>   Month  Year     A     B     C Total
#>   <ord> <dbl> <int> <int> <int> <dbl>
#> 1 Feb    1999     2     1     0     3
#> 2 May    1999     1     1     0     2
#> 3 Oct    1999     0     0     2     2
#> 4 Feb    2001     2     0     0     2
#> 5 Jun    2001     0     2     1     3
#> 6 Nov    2001     0     0     2     2
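The counting in this second solution comes entirely from values_fn = length. As a rough sketch of that mechanism on its own, the toy data frame below is hypothetical and not taken from the question; it uses one row per observation, like the original data.

```r
library(tidyr)

# Hypothetical toy data: one row per observation
toy <- data.frame(
  Year = c(1999, 1999, 1999, 2001),
  Cat  = c("A", "A", "B", "C")
)

wide <- pivot_wider(
  toy,
  id_cols     = Year,
  names_from  = Cat,     # one output column per category
  values_from = Cat,
  values_fn   = length,  # collapse each Year/Cat cell to its number of rows
  values_fill = 0        # combinations that never occur become 0
)
wide
```

Here 1999 gets A = 2, B = 1, C = 0 and 2001 gets A = 0, B = 0, C = 1, so no separate count() step is needed before pivoting.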