Conditionally subtract cells in data frame

Suppose I have the following data frame:

df1 <- data.frame(cbind("Method" = c("A", "A", "A", "A", 
                                    "B", "B", "B", "B", 
                                    "C", "C", "C", "C"),
                       "Sub" = c(rep(1:4, 2), c(1, 2, 4, 3)),
                       "Value1" = c(1, 2, 3, 4, 0, 0, 0, 0, 1, 2, 3, 4),
                       "Value2" = c(-1, -2, -3, -4, 0, 0, 0, 0, -1, -2, -3, -4),
                       "Value3" = 1:12))
    Method Sub Value1 Value2 Value3
1       A   1      1     -1      1
2       A   2      2     -2      2
3       A   3      3     -3      3
4       A   4      4     -4      4
5       B   1      0      0      5
6       B   2      0      0      6
7       B   3      0      0      7
8       B   4      0      0      8
9       C   1      1     -1      9
10      C   2      2     -2     10
11      C   4      3     -3     11
12      C   3      4     -4     12

I want to change Value1 and Value2 by subtracting the values observed for Method == A. In this case, the desired output would be:

  Method Sub Value1 Value2 Value3
1       A   1      0      0      1
2       A   2      0      0      2
3       A   3      0      0      3
4       A   4      0      0      4
5       B   1     -1      1      5
6       B   2     -2      2      6
7       B   3     -3      3      7
8       B   4     -4      4      8
9       C   1      0      0      9
10      C   2      0      0     10
11      C   4     -1      1     11
12      C   3      1     -1     12

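Basically, this amounts to subtracting df1[1:4, 3:4] from df1[5:8, 3:4] and from df1[9:12, 3:4], except that the rows have to be matched on Sub (note the order of Sub within Method == C). Any help on how to do this efficiently?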

With dplyr you can do:

library(dplyr)

df1 %>% 
  group_by(Sub) %>%                                        # one group per Sub value
  mutate(across(Value1:Value2, ~ .x - .x[Method == "A"]))  # subtract the group's Method "A" value

#    Method   Sub Value1 Value2 Value3
#    <chr>  <dbl>  <dbl>  <dbl>  <int>
#  1 A          1      0      0      1
#  2 A          2      0      0      2
#  3 A          3      0      0      3
#  4 A          4      0      0      4
#  5 B          1     -1      1      5
#  6 B          2     -2      2      6
#  7 B          3     -3      3      7
#  8 B          4     -4      4      8
#  9 C          1      0      0      9
# 10 C          2      0      0     10
# 11 C          4     -1      1     11
# 12 C          3      1     -1     12

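This creates one group per Sub; within each group, .x[Method == "A"] picks out the value from the Method == "A" row, which is then subtracted from every row in that group.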
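
For readers who prefer not to load dplyr, the same matching-on-Sub idea can be sketched in base R with match(). This is not the approach from the answer above, just an alternative sketch. It assumes there is exactly one Method == "A" row per Sub value, and the helper names cols, ref, idx and result are made up for illustration.

```r
# Same data as in the question, built without cbind() so the value columns stay numeric
df1 <- data.frame(
  Method = rep(c("A", "B", "C"), each = 4),
  Sub    = c(rep(1:4, 2), c(1, 2, 4, 3)),
  Value1 = c(1, 2, 3, 4, 0, 0, 0, 0, 1, 2, 3, 4),
  Value2 = c(-1, -2, -3, -4, 0, 0, 0, 0, -1, -2, -3, -4),
  Value3 = 1:12
)

cols <- c("Value1", "Value2")        # columns to adjust (illustrative name)
ref  <- df1[df1$Method == "A", ]     # baseline rows; assumes one "A" row per Sub
idx  <- match(df1$Sub, ref$Sub)      # for each row, the position of its Sub in the baseline

result <- df1
for (col in cols) {
  # subtract the baseline value that shares the same Sub
  result[[col]] <- df1[[col]] - ref[[col]][idx]
}
result
```

Because the lookup goes through match(), the row order within each Method does not matter, which is what handles the swapped Sub values in Method == "C".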

You can use the ifelse function. It works just like the IF function in Excel:

library(dplyr)
df2 <- df1 %>%
  mutate(Value4 = ifelse(Method == "A", Value1 - Value2 , NA))
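
To make the Excel comparison concrete, here is a tiny sketch of how ifelse() behaves on plain vectors. The vectors method and value are invented for illustration and are not part of the question's data.

```r
method <- c("A", "B", "A")   # illustrative data, not from the question
value  <- c(10, 20, 30)

# ifelse() is vectorised: the test is evaluated element by element,
# returning the "yes" value where it is TRUE and the "no" value elsewhere
flagged <- ifelse(method == "A", value, NA)
flagged
```

Unlike a single Excel cell formula, one call covers the whole column, which is why it fits inside mutate().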

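As a side note, you can build the data frame more simply, without cbind (which coerces every column to character because Method is a character vector):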

df1 <- data.frame(
  Method = c("A", "A", "A", "A", 
              "B", "B", "B", "B", 
              "C", "C", "C", "C"),
  Sub = c(rep(1:4, 2), c(1, 2, 4, 3)),
  Value1 = c(1, 2, 3, 4, 0, 0, 0, 0, 1, 2, 3, 4),
  Value2 = c(-1, -2, -3, -4, 0, 0, 0, 0, -1, -2, -3, -4),
  Value3 = 1:12
)
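
A short sketch of why the construction matters: cbind() on vectors builds a matrix, and a matrix holds a single type, so mixing character and numeric vectors yields character throughout. The object names m, bad and good are illustrative only.

```r
# cbind() on a character and a numeric vector gives a character matrix
m <- cbind(Method = c("A", "B"), Value1 = c(1, 2))
typeof(m)

bad  <- data.frame(m)                                        # columns inherit the coerced type
good <- data.frame(Method = c("A", "B"), Value1 = c(1, 2))   # each column keeps its own type

is.numeric(bad$Value1)    # FALSE: arithmetic on this column would fail
is.numeric(good$Value1)   # TRUE
```

With the columns passed directly to data.frame(), Value1 and Value2 stay numeric and can be subtracted as intended.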