Divide Value from one Dataframe by Value of another Dataframe

Here are my two data frames:

    structure(list(Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", 
"Aaron P. Graft", "Aaron P. Graft", "Aaron P. Jagdfeld"), year = c(2019, 
2020, 2019, 2020, 2019), counter = c(5541L, 3269L, 165L, 200L, 
4L)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-5L), groups = structure(list(Full.Name = c("A. Patrick Beharelle", 
"Aaron P. Graft", "Aaron P. Jagdfeld"), .rows = structure(list(
    1:2, 3:4, 5L), ptype = integer(0), class = c("vctrs_list_of", 
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L), .drop = TRUE))

structure(list(authority_dic = c("accomplished", "accomplished", 
"accomplished", "accomplished", "accomplished"), Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", 
"Aaron P. Graft", "Aaron P. Graft", "Aaron P. Jagdfeld"), Entity = c("WERNER ENTERPRISES INC", "MONDELEZ INTERNATIONAL INC", 
"AEROJET ROCKETDYNE HOLDINGS", "T-MOBILE US INC", "SOUTHWEST AIRLINES"
), `2019` = c(1L, 0L, 1L, 0L, 0L), `2020` = c(0L, 1L, 0L, 3L, 
1L)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-5L), groups = structure(list(authority_dic = c("accomplished", 
"accomplished", "accomplished", "accomplished", "accomplished"
), Full.Name = c("Derek J. Leathers", "Dirk Van de Put", "Eileen P. Drake", 
"G. Michael Sievert", "Gary C. Kelly"), .rows = structure(list(
    1L, 2L, 3L, 4L, 5L), ptype = integer(0), class = c("vctrs_list_of", 
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -5L), .drop = TRUE))

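Now I want to divide each value in the "2019" column by the "counter" value from the other data frame and add the result as a new column. The complication is that I only want to divide by the "counter" value that belongs to 2019 and to the same person (for example, Aaron P. Graft). I want to do this for every row of the data frame that contains the name "Aaron P. Graft", so the "counter" value has to come from the row of the other data frame that also contains "Aaron P. Graft".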

I can't figure this out on my own. Maybe I need to transpose the year and counter columns of the first data frame, but I'm not sure.

This is what I want to achieve:

authority_dic  Full.name          2019  2020  2019_freq            2020_freq
example word   Aaron P. Jagdfeld  10    20    10/counter(of 2019)  20/counter(of 2020)

If anything is unclear, please feel free to ask. Thanks in advance!

Let the two structures be s1 and s2; this should work:

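library(dplyr)  # mutate, full_join, summarise, group_by, %>%
library(tidyr)  # spread
mutate(
      full_join(
         # total the per-entity counts for each person
         summarise(
            group_by(s2, authority_dic, Full.Name),
            `2019`=sum(`2019`),
            `2020`=sum(`2020`)),
         # reshape s1 to one row per person, with one counter column per year
         s1 %>% spread(year,counter),
         by=c("Full.Name")),
      # .x columns come from s2 (counts), .y columns come from s1 (counters)
      `2019_freq`=`2019.x`/`2019.y`,
      `2020_freq`=`2020.x`/`2020.y`)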
# A tibble: 3 × 8
# Groups:   authority_dic [1]
  authority_dic Full.Name            `2019.x` `2020.x` `2019.y` `2020.y` `2019_freq` `2020_freq`
  <chr>         <chr>                   <int>    <int>    <int>    <int>       <dbl>       <dbl>
1 accomplished  A. Patrick Beharelle        1        1     5541     3269    0.000180    0.000306
2 accomplished  Aaron P. Graft              1        3      165      200    0.00606     0.015   
3 accomplished  Aaron P. Jagdfeld           0        1        4       NA    0          NA       
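
In current tidyr, spread() is superseded by pivot_wider(). Here is a minimal sketch of the same idea using it, not a drop-in replacement for the code above. It makes a few assumptions: the data is rebuilt by hand as ungrouped tibbles with the Entity column left out, the prefix "counter_" and the names counters, totals and res are made up for illustration, and left_join is swapped in for full_join (for this data both give the same rows).

```r
library(dplyr)
library(tidyr)

# Hand-built, ungrouped copies of the question's data (Entity omitted)
s1 <- tibble(
  Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle",
                "Aaron P. Graft", "Aaron P. Graft", "Aaron P. Jagdfeld"),
  year    = c(2019, 2020, 2019, 2020, 2019),
  counter = c(5541L, 3269L, 165L, 200L, 4L)
)
s2 <- tibble(
  authority_dic = "accomplished",
  Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle",
                "Aaron P. Graft", "Aaron P. Graft", "Aaron P. Jagdfeld"),
  `2019` = c(1L, 0L, 1L, 0L, 0L),
  `2020` = c(0L, 1L, 0L, 3L, 1L)
)

# One row per person, one counter column per year: counter_2019, counter_2020
counters <- s1 %>%
  pivot_wider(names_from = year, values_from = counter,
              names_prefix = "counter_")

# Total the counts per person
totals <- s2 %>%
  group_by(authority_dic, Full.Name) %>%
  summarise(across(c(`2019`, `2020`), sum), .groups = "drop")

# Join on the name and divide; the prefix avoids the .x/.y suffixes
res <- totals %>%
  left_join(counters, by = "Full.Name") %>%
  mutate(`2019_freq` = `2019` / counter_2019,
         `2020_freq` = `2020` / counter_2020)
```

A missing counter (Aaron P. Jagdfeld has none for 2020) simply yields NA in the corresponding frequency column, as in the output above.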

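It is good practice to avoid naming columns after values, such as 2019; use a year column instead. Your data model should be restructured into normal form (see the topic of database normalization for details).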
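
To illustrate the long-format suggestion, here is a rough sketch under some assumptions: the data is rebuilt by hand as ungrouped tibbles, and the names long, n and freq are made up for this example. With year as a real column, the lookup becomes a plain join on name and year, and no per-year column names are needed.

```r
library(dplyr)
library(tidyr)

# Hand-built, ungrouped copies of the question's data
s1 <- tibble(
  Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle",
                "Aaron P. Graft", "Aaron P. Graft", "Aaron P. Jagdfeld"),
  year    = c(2019, 2020, 2019, 2020, 2019),
  counter = c(5541L, 3269L, 165L, 200L, 4L)
)
s2 <- tibble(
  authority_dic = "accomplished",
  Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle",
                "Aaron P. Graft", "Aaron P. Graft", "Aaron P. Jagdfeld"),
  Entity = c("WERNER ENTERPRISES INC", "MONDELEZ INTERNATIONAL INC",
             "AEROJET ROCKETDYNE HOLDINGS", "T-MOBILE US INC",
             "SOUTHWEST AIRLINES"),
  `2019` = c(1L, 0L, 1L, 0L, 0L),
  `2020` = c(0L, 1L, 0L, 3L, 1L)
)

long <- s2 %>%
  # Move the year out of the column names and into a year column
  pivot_longer(c(`2019`, `2020`), names_to = "year", values_to = "n") %>%
  mutate(year = as.numeric(year)) %>%
  # Total the counts per person and year
  group_by(authority_dic, Full.Name, year) %>%
  summarise(n = sum(n), .groups = "drop") %>%
  # Pick up the matching counter by name and year, then divide
  left_join(s1, by = c("Full.Name", "year")) %>%
  mutate(freq = n / counter)
```

This gives one row per person and year; rows with no matching counter get NA for freq. If the wide layout from the question is still wanted for display, pivot_wider() can reshape the result at the end.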