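Change values in one column based on the presence of values in another column, when data are grouped by the unique values of a grouping variable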

Question

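I want to change the values in one column based on whether a value is present in another column, with the data grouped by the unique values of a grouping variable.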

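I am using the mtcars data. I want to group the data by the unique values of cyl and then create a variable called OD_gears. For each unique cyl value, OD_gears should show yes in every row of that group when the value 5 appears in any cell of the gear column for that group, and no when it does not.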

In this case, every cell in the OD_gears column should contain yes, because each unique cyl value has at least one car with 5 in the gear column.

Is this possible? If so, how? Can I use the dplyr package and group_by() to help with this task?

Answer 1

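We can use if (any(...)) to check whether the value is present anywhere in the group. Because the condition is a single TRUE or FALSE per group, mutate() recycles the resulting 'yes' or 'no' to every row of that group.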

library(dplyr)

# Flag every row of a cyl group 'yes' when any car in that group has gear == 5
mtcars %>%
  group_by(cyl) %>%
  mutate(OD_gears = if (any(gear == 5)) 'yes' else 'no') %>%
  ungroup()

#    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb OD_gears
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>   
# 1  21       6  160    110  3.9   2.62  16.5     0     1     4     4 yes     
# 2  21       6  160    110  3.9   2.88  17.0     0     1     4     4 yes     
# 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1 yes     
# 4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1 yes     
# 5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2 yes     
# 6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1 yes     
# 7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4 yes     
# 8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2 yes     
# 9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2 yes     
#10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4 yes     
# … with 22 more rows
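
Since every cyl group in mtcars contains a car with gear == 5, the output above never shows a no. The following is a minimal sketch on a made-up data frame (the names toy, grp, val and has_five are hypothetical and not part of the question) that shows both outcomes of the same if (any(...)) pattern:

```r
library(dplyr)

# Hypothetical toy data: group "a" contains a 5, group "b" does not
toy <- tibble(
  grp = c("a", "a", "a", "b", "b"),
  val = c(3, 5, 4, 3, 4)
)

flagged <- toy %>%
  group_by(grp) %>%
  # One TRUE/FALSE per group; the resulting label is recycled to all rows of the group
  mutate(has_five = if (any(val == 5)) "yes" else "no") %>%
  ungroup()

flagged$has_five
# "yes" "yes" "yes" "no" "no"
```

Rows in group "a" are all labelled yes even though only one of them holds the 5, which is the group-level behaviour asked for in the question.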

If there are many such conditions to check, case_when() can help:

mtcars %>%
  group_by(cyl) %>%
  mutate(OD_gears = case_when(any(gear == 5) ~ 'yes',
                              TRUE ~ 'no')) %>%
  ungroup()
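
To illustrate what several group-level conditions might look like, here is a hedged sketch on made-up data. The data frame toy, the column five_status, and the labels "all", "some" and "none" are assumptions for illustration only. case_when() evaluates the conditions in order and uses the first one that is TRUE, so the stricter all() test has to come before any():

```r
library(dplyr)

# Hypothetical toy data: "a" is all 5s, "b" has one 5, "c" has none
toy <- tibble(
  grp = c("a", "a", "b", "b", "c", "c"),
  val = c(5, 5, 3, 5, 3, 4)
)

labelled <- toy %>%
  group_by(grp) %>%
  mutate(five_status = case_when(
    all(val == 5) ~ "all",   # checked first: every row in the group is 5
    any(val == 5) ~ "some",  # at least one row in the group is 5
    TRUE          ~ "none"   # fallback when no row is 5
  )) %>%
  ungroup()

labelled$five_status
# "all" "all" "some" "some" "none" "none"
```

Each condition reduces to a single logical value per group, so the chosen label is recycled across the group's rows just as in the if/else version.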


grouping

if-statement

r

dplyr