group_by and fill specific rows based on capitalised row observations

I have some data that looks like this:

# A tibble: 10 × 4
   RegionName `Año 2004_1` `Año 2004_2` `Año 2004_3`
   <chr>             <dbl>        <dbl>        <dbl>
 1 ANDALUCÍA            NA           NA           NA
 2 Almería              NA           NA           NA
 3 Abla                 58           61           54
 4 Abrucena              6            2            1
 5 Adra                146          211          101
 6 ALBÁNCHEZ            12            3            3
 7 Alboloduy             2            2            2
 8 Albox                33           66           35
 9 ALCOLEA               0            1            1
10 Alcóntar              1            1            2

What I want to do is group_by and compute the sum for each capitalised RegionName, i.e. mutate(across(where(is.numeric)..., and then put that sum in the row of each capitalised region.

For example:

Using this post I can extract the capitalised names and store them in a new column with:

data %>% 
  group_by(grp = cumsum(RegionName == toupper(RegionName))) %>%
  mutate(REGIONNAME = first(RegionName)) %>% 
  relocate(REGIONNAME, .before = RegionName)

So the data now looks like:

# A tibble: 10 × 6
# Groups:   grp [3]
   REGIONNAME RegionName `Año 2004_1` `Año 2004_2` `Año 2004_3`   grp
   <chr>      <chr>             <dbl>        <dbl>        <dbl> <int>
 1 ANDALUCÍA  ANDALUCÍA            NA           NA           NA     1
 2 ANDALUCÍA  Almería              NA           NA           NA     1
 3 ANDALUCÍA  Abla                 58           61           54     1
 4 ANDALUCÍA  Abrucena              6            2            1     1
 5 ANDALUCÍA  Adra                146          211          101     1
 6 ALBÁNCHEZ  ALBÁNCHEZ            12            3            3     2
 7 ALBÁNCHEZ  Alboloduy             2            2            2     2
 8 ALBÁNCHEZ  Albox                33           66           35     2
 9 ALCOLEA    ALCOLEA               0            1            1     3
10 ALCOLEA    Alcóntar              1            1            2     3

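Ignoring the grp column, I want to group_by(REGIONNAME) and mutate(across(... the Año... columns so that I get a sum for each REGIONNAME. I then want to write that sum into the capitalised row of each group, filling in the NA values where there are any.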

Expected output (the changed values are marked as ***x***):

   REGIONNAME RegionName `Año 2004_1` `Año 2004_2` `Año 2004_3`   grp
   <chr>      <chr>             <dbl>        <dbl>        <dbl> <int>
 1 ANDALUCÍA  ANDALUCÍA        ***210***     ***274***   ***156***  1
 2 ANDALUCÍA  Almería              NA           NA           NA     1
 3 ANDALUCÍA  Abla                 58           61           54     1
 4 ANDALUCÍA  Abrucena              6            2            1     1
 5 ANDALUCÍA  Adra                146          211          101     1
 6 ALBÁNCHEZ  ALBÁNCHEZ        ***35***      ***68***     ***37***  2
 7 ALBÁNCHEZ  Alboloduy             2            2            2     2
 8 ALBÁNCHEZ  Albox                33           66           35     2
 9 ALCOLEA    ALCOLEA          ***1***        ***1***     ***2***   3
10 ALCOLEA    Alcóntar              1            1            2     3

Data:

data <- structure(list(RegionName = c("ANDALUCÍA", "Almería", "Abla", 
"Abrucena", "Adra", "ALBÁNCHEZ", "Alboloduy", "Albox", "ALCOLEA", 
"Alcóntar"), `Año 2004_1` = c(NA, NA, 58, 6, 146, 12, 2, 33, 
0, 1), `Año 2004_2` = c(NA, NA, 61, 2, 211, 3, 2, 66, 1, 1), 
    `Año 2004_3` = c(NA, NA, 54, 1, 101, 3, 2, 35, 1, 2)), row.names = c(NA, 
-10L), class = c("tbl_df", "tbl", "data.frame"))

You can replace the values in each capitalised row with the sum of the non-capitalised rows in its group:

library(dplyr)

data %>% 
  # Start a new group at every all-caps RegionName
  group_by(grp = cumsum(RegionName == toupper(RegionName))) %>%
  mutate(REGIONNAME = first(RegionName)) %>% 
  relocate(REGIONNAME, .before = RegionName) %>% 
  # In each all-caps row, put the sum of the group's other rows; leave the rest unchanged
  mutate(across(starts_with("Año"), 
                ~ ifelse(REGIONNAME == RegionName,
                         sum(.x[REGIONNAME != RegionName], na.rm = TRUE),
                         .x)))

# A tibble: 10 x 6
# Groups:   grp [3]
   REGIONNAME RegionName `Año 2004_1` `Año 2004_2` `Año 2004_3`   grp
   <chr>      <chr>             <dbl>        <dbl>        <dbl> <int>
 1 ANDALUCÍA  ANDALUCÍA           210          274          156     1
 2 ANDALUCÍA  Almería              NA           NA           NA     1
 3 ANDALUCÍA  Abla                 58           61           54     1
 4 ANDALUCÍA  Abrucena              6            2            1     1
 5 ANDALUCÍA  Adra                146          211          101     1
 6 ALBÁNCHEZ  ALBÁNCHEZ            35           68           37     2
 7 ALBÁNCHEZ  Alboloduy             2            2            2     2
 8 ALBÁNCHEZ  Albox                33           66           35     2
 9 ALCOLEA    ALCOLEA               1            1            2     3
10 ALCOLEA    Alcóntar              1            1            2     3
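
To see why the grouping step works, here is a minimal base R sketch of the same idea on a single column. It assumes the region names and the `Año 2004_1` values from the question; the names `region`, `year1`, `is_header`, `grp`, `totals` and `filled` are only illustrative and do not appear in the original code.

```r
# Region names from the question; all-caps entries mark the start of a group.
region <- c("ANDALUCÍA", "Almería", "Abla", "Abrucena", "Adra",
            "ALBÁNCHEZ", "Alboloduy", "Albox", "ALCOLEA", "Alcóntar")
# Values of the `Año 2004_1` column.
year1 <- c(NA, NA, 58, 6, 146, 12, 2, 33, 0, 1)

# TRUE where the name is already all upper case, i.e. a header row.
is_header <- region == toupper(region)

# cumsum() over TRUE/FALSE goes up by one at every header row,
# so each header and the rows below it share one group id.
grp <- cumsum(is_header)

# Sum only the non-header rows within each group, ignoring NA.
totals <- tapply(ifelse(is_header, NA, year1), grp, sum, na.rm = TRUE)

# Write each group's total into its header row; other rows stay as they are.
filled <- year1
filled[is_header] <- as.vector(totals[as.character(grp[is_header])])

filled
# 210 NA 58 6 146 35 2 33 1 1
```

The header rows are masked out before summing, so a header's own value (for example the 12 in the ALBÁNCHEZ row) is not counted in its total. That matches the `.x[REGIONNAME != RegionName]` subset in the dplyr version.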