Spread columns and count by ID in R

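I have a factor column named Lead_DataSource__c. I want to spread each factor level into its own column, then fill each cell with the number of times that level appears for the row's Id.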

Here is the head of my data frame:

head(df)


 Id                 Lead_DataSource__c numberoflead leadduration  lasttouch firsttouch
  <chr>              <chr>                     <int> <drtn>        <chr>     <chr>     
1 0010I000026fxp6QAA NA                            1       NA days NA        NA        
2 0010I000026frM6QAI Walk in                       1   0.0000 days Walk in   Walk in   
3 0010I000026frOQQAY Walk in                       1   0.0000 days Walk in   Walk in   
4 0010I000026frsUQAQ Walk in                       3 243.9656 days Walk in   Facebook  
5 0010I000026frsUQAQ Facebook                      3 243.9656 days Walk in   Facebook  
6 0010I000026frsUQAQ Facebook                      3 243.9656 days Walk in   Facebook  

This is what I need:

            Id lastcreateddateoflead lasttouch firsttouch Facebook Walk.in <NA>
            1 0010I000026frM6QAI                 43575   Walk in    Walk in        0       1    0
            2 0010I000026frOQQAY                 43843   Walk in    Walk in        0       1    0
            3 0010I000026frsUQAQ                 43794   Walk in   Facebook        2       1    0
            4 0010I000026frsUQAQ                 43794   Walk in   Facebook        2       1    0
            5 0010I000026frsUQAQ                 43794   Walk in   Facebook        2       1    0
            6 0010I000026fsBrQAI                 43699  Facebook   Facebook        1       0    0

So far I have tried this with dplyr, but I don't get the result I want (shown above):

df %>%
  group_by(Id, Lead_DataSource__c) %>%
  mutate(numberofleadsource = n()) %>%
  spread(Lead_DataSource__c, numberofleadsource, fill = 0)

This is the output of my code:

             Id lastcreateddateoflead lasttouch firsttouch Facebook Walk.in <NA>
             1 0010I000026frM6QAI                 43575   Walk in    Walk in        0       1    0
             2 0010I000026frOQQAY                 43843   Walk in    Walk in        0       1    0
             3 0010I000026frsUQAQ                 43794   Walk in   Facebook        2       0    0
             4 0010I000026frsUQAQ                 43794   Walk in   Facebook        2       0    0
             5 0010I000026frsUQAQ                 43794   Walk in   Facebook        0       1    0
             6 0010I000026fsBrQAI                 43699  Facebook   Facebook        1       0    0

Can anyone help me figure out what I am missing here?

Input data:

structure(list(Id = c("0010I000026fxp6QAA", "0010I000026frM6QAI", 
"0010I000026frOQQAY", "0010I000026frsUQAQ", "0010I000026frsUQAQ", 
"0010I000026frsUQAQ"), Lead_DataSource__c = c(NA, "Walk in", 
"Walk in", "Walk in", "Facebook", "Facebook"), numberoflead = c(1L, 
1L, 1L, 3L, 3L, 3L), leadduration = structure(c(NA, 0, 0, 243.9656, 
243.9656, 243.9656), class = "difftime", units = "days"), lasttouch = c(NA, 
"Walk in", "Walk in", "Walk in", "Walk in", "Walk in"), firsttouch = c(NA, 
"Walk in", "Walk in", "Facebook", "Facebook", "Facebook")), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

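Here I use add_count() to count how many times each Id / lead source combination occurs, then pivot_wider() to spread the sources into columns. The last line fills in the missing values left by the pivot.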

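library(dplyr)
library(tidyr)

df %>%
  # count how often each Id / lead source combination occurs
  add_count(Id, Lead_DataSource__c) %>%
  # temporary unique row key so pivot_wider() keeps one row per input row
  mutate(tmp = row_number()) %>%
  pivot_wider(names_from = Lead_DataSource__c, values_from = n) %>%
  select(-tmp) %>%
  group_by(Id) %>%
  # within each Id, use the first non-missing count in the column, or 0 if there is none
  mutate_at(c("NA", "Walk in", "Facebook"), ~ifelse(any(!is.na(.)), .[!is.na(.)][1], 0))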
# A tibble: 6 x 8
# Groups:   Id [4]
  Id                 numberoflead leadduration  lasttouch firsttouch  `NA` `Walk in` Facebook
  <chr>                     <int> <drtn>        <chr>     <chr>      <dbl>     <dbl>    <dbl>
1 0010I000026fxp6QAA            1       NA days NA        NA             1         0        0
2 0010I000026frM6QAI            1   0.0000 days Walk in   Walk in        0         1        0
3 0010I000026frOQQAY            1   0.0000 days Walk in   Walk in        0         1        0
4 0010I000026frsUQAQ            3 243.9656 days Walk in   Facebook       0         1        2
5 0010I000026frsUQAQ            3 243.9656 days Walk in   Facebook       0         1        2
6 0010I000026frsUQAQ            3 243.9656 days Walk in   Facebook       0         1        2
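
The count-then-pivot idea above can also be sketched without the temporary row key: build a one-row-per-Id table of counts first, then join it back onto the original rows. This is only an alternative sketch, not the code that produced the output above. The names `counts_wide` and `result` are made up for illustration, and `df` here is a cut-down copy of the question's data holding only the two relevant columns.

```r
library(dplyr)
library(tidyr)

# Cut-down copy of the question's data: only the Id and lead source columns.
df <- tibble(
  Id = c("0010I000026fxp6QAA", "0010I000026frM6QAI", "0010I000026frOQQAY",
         "0010I000026frsUQAQ", "0010I000026frsUQAQ", "0010I000026frsUQAQ"),
  Lead_DataSource__c = c(NA, "Walk in", "Walk in", "Walk in", "Facebook", "Facebook")
)

# One row per Id, one column per lead source, cells holding the counts.
# Combinations that never occur are filled with 0.
counts_wide <- df %>%
  count(Id, Lead_DataSource__c) %>%
  pivot_wider(
    names_from = Lead_DataSource__c,
    values_from = n,
    values_fill = list(n = 0L)
  )

# Attach the per-Id counts to every original row (row order of df is kept).
result <- df %>%
  left_join(counts_wide, by = "Id")

print(result)
```

Because the counts are computed once per Id and then joined, no per-group fill step is needed afterwards. With the full data frame from the question, the other columns would simply be carried along by the join.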