Pivot table in R using count
I have this dataset:
created On | status
---------------------
2021-10-21 | complete
2021-10-21 | complete
2021-10-21 | partial complete
2021-10-21 | on going
2021-10-20 | partial complete
2021-10-20 | on going
2021-10-19 | complete
I am trying to create a pivot table like the ones in Excel. The expected output looks like this:
created on | complete | partial complete | on going | total | percent (complete+partial)/total
-----------------------------------------------------------------------------------------------------------
2021-10-21 | 2 | 1 | 1 | 4 | 75
2021-10-20 | 0 | 1 | 1 | 2 | 50
2021-10-19 | 1 | 0 | 0 | 1 | 100
When I try the following:
daily_numbers1 %>%
  pivot_table(
    .rows = ~ `created on`,
    .columns = ~ status,
    .values = ~ status
  )
I get the following output:
created on | complete | partial complete | on going | total | percent (complete+partial)/total
-----------------------------------------------------------------------------------------------------------
2021-10-21 | <chr [2]> | <chr[1]> | <chr [1]> | 4 | 50
2021-10-20 | <NULL> | <chr [1]> | <chr [1]> | 2 | 50
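It looks like I am getting character vectors instead of numeric counts. I am also still working out how to add the total and percent columns, so it would help if you could add those as well.
Another approach I tried:
daily_numbers1 %>%
  group_by(`Created On`) %>%
  summarize(status = count(status))
I get this: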
status | freq
---------------------
complete | 3
partial | 2
on going | 2
This is far from what I need, which is the counts grouped by date.
In base R, this is easier:
out <- addmargins(table(daily_numbers1), 2)
cbind(out, percent = rowSums(out[, c('complete', 'partial complete')])/
out[, 'Sum'])
-output
# complete on going partial complete Sum percent
#2021-10-19 1 0 0 1 1.00
#2021-10-20 0 1 1 2 0.50
#2021-10-21 2 1 1 4 0.75
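The expected output in the question shows the percent as a whole number and lists the newest date first. A minimal base R sketch of that variation, assuming the same `daily_numbers1` data as in this answer; the names `percent` and `res` are introduced here only for illustration:

```r
# Sample data, copied from the data section of this answer
daily_numbers1 <- data.frame(
  createdOn = c("2021-10-21", "2021-10-21", "2021-10-21", "2021-10-21",
                "2021-10-20", "2021-10-20", "2021-10-19"),
  status = c("complete", "complete", "partial complete", "on going",
             "partial complete", "on going", "complete")
)

# Cross-tabulate date by status and append a per-row total column named "Sum"
out <- addmargins(table(daily_numbers1), 2)

# Share of complete + partial complete, scaled to a whole-number percentage
percent <- round(100 * rowSums(out[, c("complete", "partial complete")]) /
                   out[, "Sum"])

# Bind the percentage on and sort rows from newest to oldest date
res <- cbind(out, percent = percent)
res <- res[order(rownames(res), decreasing = TRUE), ]
res
```

With the sample data, the percent column should come out as 75, 50 and 100 for 2021-10-21, 2021-10-20 and 2021-10-19.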
Or use janitor:
library(janitor)
library(dplyr)
daily_numbers1 %>%
  tabyl(createdOn, status) %>%
  adorn_totals("col") %>%
  mutate(Perc = (complete + `partial complete`)/Total)
# createdOn complete on going partial complete Total Perc
# 2021-10-19 1 0 0 1 1.00
# 2021-10-20 0 1 1 2 0.50
# 2021-10-21 2 1 1 4 0.75
Or use pivot_wider together with adorn_totals:
library(tidyr)
daily_numbers1 %>%
  pivot_wider(names_from = status, values_from = status,
              values_fn = length, values_fill = 0) %>%
  adorn_totals('col') %>%
  mutate(Perc = (complete + `partial complete`)/Total)
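The second attempt in the question can be adapted along the same lines. The sketch below assumes only dplyr and tidyr: it counts each date/status pair with `dplyr::count()`, spreads the statuses into columns, and then adds the total and percent. The `freq` column in the question's output suggests that `count()` was resolved to another package (possibly plyr); that is a guess, and the explicit `dplyr::` prefix is there to sidestep it. The names `status_cols`, `total`, `percent` and `res` are illustrative choices, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Sample data, copied from the data section of this answer
daily_numbers1 <- data.frame(
  createdOn = c("2021-10-21", "2021-10-21", "2021-10-21", "2021-10-21",
                "2021-10-20", "2021-10-20", "2021-10-19"),
  status = c("complete", "complete", "partial complete", "on going",
             "partial complete", "on going", "complete")
)

# Status columns that make up the row total
status_cols <- c("complete", "partial complete", "on going")

res <- daily_numbers1 %>%
  # One row per date/status pair, with the count stored in column n
  dplyr::count(createdOn, status) %>%
  # One column per status; pairs that never occur are filled with 0
  pivot_wider(names_from = status, values_from = n, values_fill = 0) %>%
  mutate(
    total = rowSums(across(all_of(status_cols))),
    percent = 100 * (complete + `partial complete`) / total
  ) %>%
  # Newest date first, as in the expected output
  arrange(desc(createdOn))

res
```

Counting before pivoting means each cell already holds a single number, so `values_fn` is not needed and no list columns appear.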
data
daily_numbers1 <- structure(list(
  createdOn = c("2021-10-21", "2021-10-21", "2021-10-21", "2021-10-21",
                "2021-10-20", "2021-10-20", "2021-10-19"),
  status = c("complete", "complete", "partial complete", "on going",
             "partial complete", "on going", "complete")),
  class = "data.frame", row.names = c(NA, -7L))