Creating A New Calculated Category Within A Column in R

Suppose I have a data frame similar to this one, only with 1000 observations:

df <- data.frame(Group = c('A', 'A', 'A', 'B', 'B',
                           'B','B','C','C','C','D','D','D','D','D'),
                 Values=c('5','7','9','0','8','4','5','2','1','3','6','3','1','3','5'))

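What I want to do is add a new calculated group to the data frame, based on the values of a group that already exists in it, without replacing the original group's values. For example, suppose I want to keep group D but also create a new group that holds all of group D's values + 2.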

An example of the data frame I want as a result:

df <- data.frame(Group = c('A', 'A', 'A', 'B', 'B',
                           'B','B','C','C','C','D','D','D','D','D'
                           ,'Dadjusted','Dadjusted','Dadjusted','Dadjusted','Dadjusted'),
                 Values=c('5','7','9','0','8','4','5','2','1','3','6','3','1','3','5',
                          '8','5','3','5','7'))

I tried using an ifelse statement like this:

df$adjustedvalues <- ifelse(df$Group == 'D', as.numeric(df$Values) + 2, as.numeric(df$Values))

But this approach adds a new column rather than new rows, producing a data frame like this:

df <- data.frame(Group = c('A', 'A', 'A', 'B', 'B',
                           'B','B','C','C','C','D','D','D','D','D'),
                 Values=c('5','7','9','0','8','4','5','2','1','3','6','3','1','3','5'),
                 adjustedvalues=c('5','7','9','0','8','4','5','2','1','3','8','5','3','5','7'))

This is not ideal for my purposes.

You can use bind_rows:

library(tidyverse)

df %>% 
  bind_rows(df %>% 
            filter(Group == "D") %>%
            mutate(Values = as.character(as.numeric(Values) + 2),
                   Group = "Dadjusted"))
#>        Group Values
#> 1          A      5
#> 2          A      7
#> 3          A      9
#> 4          B      0
#> 5          B      8
#> 6          B      4
#> 7          B      5
#> 8          C      2
#> 9          C      1
#> 10         C      3
#> 11         D      6
#> 12         D      3
#> 13         D      1
#> 14         D      3
#> 15         D      5
#> 16 Dadjusted      8
#> 17 Dadjusted      5
#> 18 Dadjusted      3
#> 19 Dadjusted      5
#> 20 Dadjusted      7

Created on 2022-04-26 by the reprex package (v2.0.1)
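
If more than one group needs an adjusted copy, the same filter / mutate / bind_rows pattern seems to extend naturally. The following is only a sketch: the vector `groups_to_adjust` and the choice of groups C and D are assumptions made for illustration, since the question only asks about D.

```r
library(dplyr)

# Example data from the question (Values stored as character strings).
df <- data.frame(Group = c('A', 'A', 'A', 'B', 'B',
                           'B','B','C','C','C','D','D','D','D','D'),
                 Values = c('5','7','9','0','8','4','5','2','1','3','6','3','1','3','5'))

# Hypothetical: the groups to duplicate with an offset applied.
groups_to_adjust <- c("C", "D")

# Build the adjusted copies, then append them below the untouched original rows.
adjusted <- df %>%
  filter(Group %in% groups_to_adjust) %>%
  mutate(Values = as.character(as.numeric(Values) + 2),
         Group = paste0(Group, "adjusted"))

result <- bind_rows(df, adjusted)
```

Because `paste0` is vectorised, each copied row gets a name derived from its own group, so the original C and D rows stay as they were.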

Here is a possible base R option:

rbind(df, data.frame(Group = "Dadjusted", 
                     Values = as.integer(df$Values)[df$Group == "D"]+2))

Output

       Group Values
1          A      5
2          A      7
3          A      9
4          B      0
5          B      8
6          B      4
7          B      5
8          C      2
9          C      1
10         C      3
11         D      6
12         D      3
13         D      1
14         D      3
15         D      5
16 Dadjusted      8
17 Dadjusted      5
18 Dadjusted      3
19 Dadjusted      5
20 Dadjusted      7
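
The rbind approach above could be wrapped in a small function if the adjustment has to be repeated. This is a minimal sketch, assuming Values is converted to numeric up front; the helper name `add_adjusted_group` and its arguments are hypothetical and not part of the original answer.

```r
# Example data from the question, with Values stored as numbers instead of strings.
df <- data.frame(
  Group  = c('A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C',
             'D', 'D', 'D', 'D', 'D'),
  Values = c(5, 7, 9, 0, 8, 4, 5, 2, 1, 3, 6, 3, 1, 3, 5)
)

# Hypothetical helper: append a copy of one group whose values are
# shifted by `offset`, leaving the original rows unchanged.
add_adjusted_group <- function(data, group, offset, suffix = "adjusted") {
  is_group <- data$Group == group
  new_rows <- data.frame(
    Group  = rep(paste0(group, suffix), sum(is_group)),
    Values = data$Values[is_group] + offset
  )
  rbind(data, new_rows)
}

result <- add_adjusted_group(df, "D", 2)
tail(result, 5)
```

Building the new rows with `data.frame()` rather than subsetting `data` keeps the original row names out of the appended block, and keeping Values numeric avoids converting back and forth between strings and numbers.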

Update: Here is also a dplyr solution, similar to @Allan Cameron's but less elegant:

library(dplyr)

df %>% 
  type.convert(as.is=TRUE) %>% 
  filter(Group=="D") %>% 
  mutate(Group = "Dadjusted",
         Values = Values + 2) %>% 
  bind_rows(df %>% 
              type.convert(as.is = TRUE)) %>% 
  arrange(Group)
         Group Values
1          A      5
2          A      7
3          A      9
4          B      0
5          B      8
6          B      4
7          B      5
8          C      2
9          C      1
10         C      3
11         D      6
12         D      3
13         D      1
14         D      3
15         D      5
16 Dadjusted      8
17 Dadjusted      5
18 Dadjusted      3
19 Dadjusted      5
20 Dadjusted      7