Pivot table in R using count

I have this dataset:

created On | status   
--------------------- 
2021-10-21 | complete 
2021-10-21 | complete
2021-10-21 | partial complete
2021-10-21 | on going
2021-10-20 | partial complete
2021-10-20 | on going
2021-10-19 | complete

I am trying to create a pivot table like the ones in Excel. The expected output looks like this:

created on     |   complete     |     partial complete     |     on going     |     total     |     percent (complete+partial)/total
-----------------------------------------------------------------------------------------------------------
2021-10-21     |   2            |     1                    |     1            |     4         |    75
2021-10-20     |   0            |     1                    |     1            |     2         |    50
2021-10-19     |   1            |     0                    |     0            |     1         |    100 

When I try the following:

daily_numbers1 %>%
  pivot_table(
    .rows =  ~ `created on`,
    .columns = ~ status,
    .values = ~ status
  )

I get the following output:

created on     |   complete     |     partial complete     |     on going     |     total     |     percent (complete+partial)/total
-----------------------------------------------------------------------------------------------------------
2021-10-21     |   <chr [2]>    |     <chr[1]>             |     <chr [1]>    |     4         |    50
2021-10-20     |   <NULL>       |     <chr [1]>            |     <chr [1]>    |     2         |    50

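It looks like the cells contain character list-columns rather than numeric counts. I am also still working out how to add the total and percent columns, so it would help if an answer could cover those too.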

Another approach I tried:

  daily_numbers1 %>% 
    group_by(`Created On`) %>%
    summarize(status = count(status)) 

I got this:

status     |     freq
---------------------
complete   |     3
partial    |     2
on going   |     2

This is far from what I need, which is counts grouped by date.

In base R, this is easier:

out <- addmargins(table(daily_numbers1), 2)
cbind(out, percent = rowSums(out[, c('complete', 'partial complete')])/
           out[, 'Sum'])

-output

#            complete on going partial complete Sum percent
#2021-10-19        1        0                0   1    1.00
#2021-10-20        0        1                1   2    0.50
#2021-10-21        2        1                1   4    0.75

Or using janitor

library(janitor)
library(dplyr)
daily_numbers1 %>%
    tabyl(createdOn, status) %>%
    adorn_totals("col") %>%
    mutate(Perc = (complete + `partial complete`)/Total)
#  createdOn complete on going partial complete Total Perc
# 2021-10-19        1        0                0     1 1.00
# 2021-10-20        0        1                1     2 0.50
# 2021-10-21        2        1                1     4 0.75
 

Or use pivot_wider together with adorn_totals

library(tidyr)
library(dplyr)    # for %>% and mutate
library(janitor)  # for adorn_totals
daily_numbers1 %>%
   pivot_wider(names_from = status, values_from = status,
        values_fn = length, values_fill = 0) %>% 
   adorn_totals('col') %>% 
   mutate(Perc = (complete + `partial complete`)/Total)
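
If you would rather not depend on janitor, the same table can probably be built with dplyr and tidyr alone. The following is only a sketch under a few assumptions: the columns are named createdOn and status as in the sample data, the three status values are exactly those shown in the question, and the names result, total and percent are made up for illustration. The percent column is scaled to 0-100 to match the expected output in the question.

```r
library(dplyr)
library(tidyr)

# Sample data re-created from the question (assumed column names)
daily_numbers1 <- data.frame(
  createdOn = c("2021-10-21", "2021-10-21", "2021-10-21", "2021-10-21",
                "2021-10-20", "2021-10-20", "2021-10-19"),
  status = c("complete", "complete", "partial complete", "on going",
             "partial complete", "on going", "complete")
)

result <- daily_numbers1 %>%
  # one row per date/status pair, with the count in column n
  count(createdOn, status) %>%
  # spread the statuses into columns; missing combinations become 0
  pivot_wider(names_from = status, values_from = n, values_fill = 0) %>%
  # row total and share of complete + partial complete, on a 0-100 scale
  mutate(total = complete + `partial complete` + `on going`,
         percent = 100 * (complete + `partial complete`) / total) %>%
  # newest date first, as in the expected output
  arrange(desc(createdOn))

result
```

Counting first with count() keeps the values numeric from the start, which avoids the list-columns seen in the pivot_table attempt. If new status values can appear, the hard-coded sum in total would need to be replaced with something like rowSums over the status columns.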

data

daily_numbers1 <- structure(list(createdOn = c("2021-10-21", "2021-10-21", 
"2021-10-21", 
"2021-10-21", "2021-10-20", "2021-10-20", "2021-10-19"), status = c("complete", 
"complete", "partial complete", "on going", "partial complete", 
"on going", "complete")), class = "data.frame", row.names = c(NA, 
-7L))