Splitting the date alongside pivot_wider
I have the following table.
Date | Cat |
---|---|
15/2/1999 | A |
15/2/1999 | A |
15/2/1999 | B |
15/5/1999 | A |
15/5/1999 | B |
15/10/1999 | C |
15/10/1999 | C |
15/2/2001 | A |
15/2/2001 | A |
15/6/2001 | B |
15/6/2001 | B |
15/6/2001 | C |
15/11/2001 | C |
15/11/2001 | C |
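I would like to apply pivot_wider (or any other similar function) to it while also splitting the date into separate month and year columns, as shown below. The Cat column is split into the variables A, B and C, and the counts are displayed.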
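Month | Year | A | B | C | Total |
---|---|---|---|---|---|
February | 1999 | 2 | 1 | 0 | 3 |
May | 1999 | 1 | 1 | 0 | 2 |
October | 1999 | 0 | 0 | 2 | 2 |
February | 2001 | 2 | 0 | 0 | 2 |
June | 2001 | 0 | 2 | 1 | 3 |
November | 2001 | 0 | 0 | 2 | 2 |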
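Does anyone here know how I can do both at the same time? Thanks.
You can do this with the tidyverse packages. First, convert the Date column to class Date, then count by month, pivot wider, and format the table.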
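library(tidyverse)
library(lubridate)  # year() and month(); attached by tidyverse only from 2.0.0 on

data %>%
  # parse the day/month/year strings into Date objects
  mutate(Date = as.Date(Date, format = "%d/%m/%Y")) %>%
  # count rows per category and calendar month
  group_by(Cat, month = floor_date(Date, "month")) %>%
  count(Cat) %>%
  # drop the grouping before reshaping
  ungroup() %>%
  # one column per category; missing combinations become 0
  pivot_wider(names_from = Cat, values_from = n, values_fill = 0) %>%
  mutate(year = year(month), .before = "A",
         month = month(month, label = TRUE, abbr = FALSE)) %>%
  mutate(Total = rowSums(across(A:C))) %>%
  arrange(year, month)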
  month     year     A     B     C Total
  <ord>    <dbl> <int> <int> <int> <dbl>
1 February  1999     2     1     0     3
2 May       1999     1     1     0     2
3 October   1999     0     0     2     2
4 February  2001     2     0     0     2
5 June      2001     0     2     1     3
6 November  2001     0     0     2     2
Data
data <- structure(list(Date = c("15/2/1999", "15/2/1999", "15/2/1999",
"15/5/1999", "15/5/1999", "15/10/1999", "15/10/1999", "15/2/2001",
"15/2/2001", "15/6/2001", "15/6/2001", "15/6/2001", "15/11/2001",
"15/11/2001"), Cat = c("A", "A", "B", "A", "B", "C", "C", "A",
"A", "B", "B", "C", "C", "C")), class = "data.frame", row.names = c(NA,
-14L))
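If it helps to see what the pipeline above is doing, here is a minimal sketch that stops at the two intermediate tables: the long counts and the pivoted counts. The names `long_counts` and `wide_counts` are made up for illustration, and it uses `dmy()` and a single `count()` call in place of `as.Date()` plus `group_by()`, which should give the same counts.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Same data as in the question, rebuilt here so the block runs on its own
data <- data.frame(
  Date = c("15/2/1999", "15/2/1999", "15/2/1999", "15/5/1999", "15/5/1999",
           "15/10/1999", "15/10/1999", "15/2/2001", "15/2/2001", "15/6/2001",
           "15/6/2001", "15/6/2001", "15/11/2001", "15/11/2001"),
  Cat = c("A", "A", "B", "A", "B", "C", "C", "A", "A", "B", "B", "C", "C", "C")
)

# Step 1: long table, one row per (month, Cat) pair with its count in `n`
# (long_counts is a hypothetical name, not from the answer)
long_counts <- data %>%
  mutate(Date = dmy(Date)) %>%
  count(month = floor_date(Date, "month"), Cat)

# Step 2: spread Cat into columns; pairs that never occur are filled with 0
# (wide_counts is a hypothetical name, not from the answer)
wide_counts <- long_counts %>%
  pivot_wider(names_from = Cat, values_from = n, values_fill = 0)

long_counts
wide_counts
```

From `wide_counts`, the remaining steps only split `month` into a month label and a year and add the row total.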
Another possible solution:
library(tidyverse)
library(lubridate)
df <- data.frame(
stringsAsFactors = FALSE,
Date = c("15/2/1999",
"15/2/1999","15/2/1999","15/5/1999","15/5/1999",
"15/10/1999","15/10/1999","15/2/2001","15/2/2001",
"15/6/2001","15/6/2001","15/6/2001","15/11/2001",
"15/11/2001"),
Cat = c("A","A","B","A",
"B","C","C","A","A","B","B","C","C","C")
)
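df %>%
  # parse with dmy() first; month() on the raw "15/2/1999" string would rely on
  # R guessing the date format
  mutate(Month = month(dmy(Date), label = TRUE), Year = year(dmy(Date))) %>%
  # count rows per (Month, Year) for each category; missing combinations become 0
  pivot_wider(id_cols = c(Month, Year), names_from = Cat,
              values_from = Cat, values_fn = length, values_fill = 0) %>%
  mutate(Total = rowSums(across(A:C)))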
#> # A tibble: 6 × 6
#>   Month  Year     A     B     C Total
#>   <ord> <dbl> <int> <int> <int> <dbl>
#> 1 Feb    1999     2     1     0     3
#> 2 May    1999     1     1     0     2
#> 3 Oct    1999     0     0     2     2
#> 4 Feb    2001     2     0     0     2
#> 5 Jun    2001     0     2     1     3
#> 6 Nov    2001     0     0     2     2
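The output above has abbreviated month names, while the question asks for full ones. A possible variant, as a sketch only: it assumes `abbr = FALSE` is what is wanted and swaps `values_fn = length` for an explicit `count()`. The name `res` is made up for illustration, and the month labels follow the session's LC_TIME locale, so they are only "February", "May", and so on in an English locale.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Same data as above, rebuilt here so the block runs on its own
df <- data.frame(
  Date = c("15/2/1999", "15/2/1999", "15/2/1999", "15/5/1999", "15/5/1999",
           "15/10/1999", "15/10/1999", "15/2/2001", "15/2/2001", "15/6/2001",
           "15/6/2001", "15/6/2001", "15/11/2001", "15/11/2001"),
  Cat = c("A", "A", "B", "A", "B", "C", "C", "A", "A", "B", "B", "C", "C", "C")
)

# res is a hypothetical name, not from the answer
res <- df %>%
  mutate(Date  = dmy(Date),
         Month = month(Date, label = TRUE, abbr = FALSE),  # full month names
         Year  = year(Date)) %>%
  count(Month, Year, Cat) %>%                              # long counts
  pivot_wider(names_from = Cat, values_from = n, values_fill = 0) %>%
  mutate(Total = A + B + C) %>%
  arrange(Year, Month)                                     # chronological order

res
```

Because `Month` is an ordered factor, `arrange(Year, Month)` sorts by calendar month rather than alphabetically.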