DF transform with multiple combinations

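I am a beginner in R.

How can I transform a DF like this?

I am trying to build a DF that contains a count for each combination of two factors.

The condition is as follows: within each id, if Consult_A == 1 & Reply_A == 1, that id counts as "1" for the pair (A, A). With this transformation I want to get the flow of connections between the Consult and Reply items.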

# original DF
df <- data.frame(
          id = c(1L, 2L),
   Consult_A = c(1L, 1L),
   Consult_B = c(1L, 0L),
   Consult_C = c(1L, 0L),
     Reply_A = c(1L, 1L),
     Reply_B = c(0L, 0L),
     Reply_C = c(1L, 1L)
)


# answer DF (I want to get every combination of Consult and Reply)
ans_omit = data.frame(
           Consult = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
             Reply = c("A", "B", "C", "A", "B", "C", "A", "B", "C"),
             Count = c(2L, 0L, 2L, 1L, 0L, 1L, 1L, 0L, 1L)
)

I think this is easier to manage in a tidier format. First, you can use pivot_longer to put the data into long format and drop the rows whose value is zero:

library(tidyverse)

df_long <- df %>%
  pivot_longer(cols = -id, names_to = c("var", "factor"), names_sep = "_") %>%
  filter(value == 1) %>%
  select(-value)

df_long

     id var     factor
  <int> <chr>   <chr> 
1     1 Consult A     
2     1 Consult B     
3     1 Consult C     
4     1 Reply   A     
5     1 Reply   C     
6     2 Consult A     
7     2 Reply   A     
8     2 Reply   C

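Then you can do a full_join between the "Consult" rows and the "Reply" rows, by id, to get every combination of the two within each id. Finally, count the distinct ids per combination to get the desired Count column, and use complete to add the combinations whose count is zero.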

full_join(
  df_long %>% filter(var == "Consult") %>% rename(Consult = factor),
  df_long %>% filter(var == "Reply") %>% rename(Reply = factor),
  by = "id"
) %>%
  group_by(Consult, Reply) %>%
  summarise(Count = n_distinct(id)) %>%
  ungroup() %>%
  complete(Consult, Reply = unique(Consult), fill = list(Count = 0))

Output

  Consult Reply Count
  <chr>   <chr> <dbl>
1 A       A         2
2 A       B         0
3 A       C         2
4 B       A         1
5 B       B         0
6 B       C         1
7 C       A         1
8 C       B         0
9 C       C         1
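
As a cross-check on the counts above, here is a minimal base R sketch. It is not the approach used in this answer. It assumes every Consult_* and Reply_* column holds only 0 or 1, so that the number of ids with both flags set equals the sum of their products, which is what crossprod computes. The names consult, reply, counts and result are just illustrative.

```r
# Same example data as in the question
df <- data.frame(
         id = c(1L, 2L),
  Consult_A = c(1L, 1L),
  Consult_B = c(1L, 0L),
  Consult_C = c(1L, 0L),
    Reply_A = c(1L, 1L),
    Reply_B = c(0L, 0L),
    Reply_C = c(1L, 1L)
)

# Split the 0/1 indicator columns into two matrices (one row per id)
consult <- as.matrix(df[, grep("^Consult_", names(df))])
reply   <- as.matrix(df[, grep("^Reply_", names(df))])

# t(consult) %*% reply: cell (i, j) is the number of ids where
# Consult i and Reply j are both 1 (assumes strictly 0/1 values)
counts <- crossprod(consult, reply)
dimnames(counts) <- list(
  Consult = sub("^Consult_", "", colnames(consult)),
  Reply   = sub("^Reply_", "", colnames(reply))
)

# Reshape the matrix into one row per (Consult, Reply) pair
result <- as.data.frame(as.table(counts),
                        responseName = "Count",
                        stringsAsFactors = FALSE)
result <- result[order(result$Consult, result$Reply), ]
rownames(result) <- NULL
result
```

Because the matrix product includes the zero cells, no separate step is needed to add the missing combinations. If the columns could contain values other than 0 and 1, this shortcut would no longer count ids and the join-based approach above would be the safer choice.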