Calculate CTR (click-through rate) and output as a new field based on 3 distinct groups in R

I have a data frame whose summary() output looks like this:

                             user_id           split_test_group  Exposed_Date        Exposed_Time                             event_type     platform       event_type_button    Event_Date          Event_Time       
 b861f69cb669766cc0d52bc2279332c0:   414   control     :27274   Min.   :2017-06-29   Length:142123      referrer_page_invite_action:  8896   iOS:135067              :133227   Min.   :2017-06-02   Length:142123     
 5ac762996eb6e8932fd2140e77a7a870:   376   tab_only    :67381   1st Qu.:2017-07-05   Class :character   referrer_page_viewed       :133227   Web:  7056   Link       :  1816   1st Qu.:2017-07-21   Class :character  
 00c26d8255da64576a9e3c5a2c1271eb:   298   tab_settings:47468   Median :2017-07-08   Mode  :character                                                     SMS        :  1727   Median :2017-08-04   Mode  :character  
 2d4fdc1606ad722a93a98882f9ccf331:   236                        Mean   :2017-07-12                                                                        link       :  1226   Mean   :2017-08-02                     
 4296772f2a0ce768d25573863c82968c:   236                        3rd Qu.:2017-07-20                                                                        email      :  1118   3rd Qu.:2017-08-17                     
 fe7a74a142b5774375ba64184d67a8e3:   212                        Max.   :2017-09-01                                                                        ShareDialog:  1062   Max.   :2017-09-01                     
 (Other)                         :140351                                                                                                                  (Other)    :  1947   

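I would like to reshape the data frame using the dplyr package so that I can compare the number of referrer_page_viewed events with the number of referrer_page_invite_action events for each split_test_group and derive a click-through rate.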

With this table, I would like to calculate the click-through rate for each split_test_group:

event_type                     split_test_group                         Total_Impressions
referrer_page_invite_action          control                            1892        
referrer_page_invite_action          tab_only                           4009        
referrer_page_invite_action          tab_settings                       2995        
referrer_page_viewed                 control                            25382       
referrer_page_viewed                 tab_only                           63372       
referrer_page_viewed                 tab_settings                       44473   

Equation: CTR (per user ID) = # of referrer_page_invite_action / # of referrer_page_viewed

The desired output table would look like this (made-up values):

split_test_group | CTR
control             x%
tab_only            x%
tab_settings        x% 

We can reshape to 'wide' format with pivot_wider, so that each event_type becomes its own column, and then do the division

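library(dplyr)
library(tidyr)
df1 %>% 
     # one row per split_test_group, one column per event_type
     pivot_wider(names_from = event_type, values_from = Total_Impressions) %>% 
     # keep the group and express invite actions / page views as a percentage
     transmute(split_test_group, CTR = 100 * referrer_page_invite_action /
            referrer_page_viewed)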

-output

# A tibble: 3 x 2
  split_test_group   CTR
  <chr>            <dbl>
1 control           7.45
2 tab_only          6.33
3 tab_settings      6.73
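
As an alternative, the same ratio can probably be computed without reshaping at all, by grouping on split_test_group and picking the two counts out with logical indexing. This is only a sketch: it assumes the aggregated table has exactly the columns shown in the question, and the name `ctr` is just an illustrative choice.

```r
library(dplyr)

# Same aggregated counts as the table in the question
df1 <- data.frame(
  event_type = rep(c("referrer_page_invite_action", "referrer_page_viewed"), each = 3),
  split_test_group = rep(c("control", "tab_only", "tab_settings"), times = 2),
  Total_Impressions = c(1892L, 4009L, 2995L, 25382L, 63372L, 44473L)
)

# `ctr` is an illustrative name, not something from the question
ctr <- df1 %>%
  group_by(split_test_group) %>%
  summarise(
    # invite actions divided by page views, as a percentage
    CTR = 100 * sum(Total_Impressions[event_type == "referrer_page_invite_action"]) /
      sum(Total_Impressions[event_type == "referrer_page_viewed"]),
    .groups = "drop"
  )

ctr
```

Using sum() around each subset means a group with no matching rows gives 0 rather than an error, which may or may not be what you want.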

data

df1 <- structure(list(event_type = c("referrer_page_invite_action", 
"referrer_page_invite_action", "referrer_page_invite_action", 
"referrer_page_viewed", "referrer_page_viewed", "referrer_page_viewed"
), split_test_group = c("control", "tab_only", "tab_settings", 
"control", "tab_only", "tab_settings"), Total_Impressions = c(1892L, 
4009L, 2995L, 25382L, 63372L, 44473L)), class = "data.frame", row.names = c(NA, 
-6L))
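
If you start from the raw event-level data frame instead of the aggregated table, one possible route is to count the events first and then reshape. The sketch below assumes one row per event with split_test_group and event_type columns, as the summary in the question suggests. The `events` data frame is made-up toy data and `ctr_raw` is an illustrative name.

```r
library(dplyr)
library(tidyr)

# Made-up toy data standing in for the raw event-level data frame:
# control has 4 views and 1 invite action, tab_only has 2 views and 1 invite action
events <- data.frame(
  split_test_group = rep(c("control", "tab_only"), times = c(5, 3)),
  event_type = c(rep("referrer_page_viewed", 4), "referrer_page_invite_action",
                 rep("referrer_page_viewed", 2), "referrer_page_invite_action")
)

ctr_raw <- events %>%
  # number of events per group and event type
  count(split_test_group, event_type, name = "Total_Impressions") %>%
  # one column per event type; groups missing an event type get 0
  pivot_wider(names_from = event_type, values_from = Total_Impressions,
              values_fill = 0) %>%
  # invite actions divided by page views, as a percentage
  transmute(split_test_group,
            CTR = 100 * referrer_page_invite_action / referrer_page_viewed)

ctr_raw
```

With the toy data this gives 25 for control and 50 for tab_only. On the real data, the count() step should reproduce the aggregated table used above.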