Conditionally take value from column1 if the column1 name == first(value) from column2 BY GROUP

I have this fake data frame:

df <- structure(list(Group = c(1L, 1L, 2L, 2L), A = 1:4, B = 5:8, C = 9:12, 
X = c("A", "A", "B", "B")), class = "data.frame", row.names = c(NA, -4L))

  Group A B  C X
1     1 1 5  9 A
2     1 2 6 10 A
3     2 3 7 11 B
4     2 4 8 12 B

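I am trying to mutate a new column that, within each group, takes its value from the column whose name is given by the first value of X:

Desired output: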

Group A B  C X new_col
    1 1 5  9 A       1
    1 2 6 10 A       1
    2 3 7 11 B       7
    2 4 8 12 B       7

My attempt so far:

library(dplyr)

df %>% 
  group_by(Group) %>% 
  mutate(across(c(A,B,C), ~ifelse(first(X) %in% colnames(.), first(.), .), .names = "new_{.col}"))

  Group     A     B     C X     new_A new_B new_C
  <int> <int> <int> <int> <chr> <int> <int> <int>
1     1     1     5     9 A         1     5     9
2     1     2     6    10 A         1     5     9
3     2     3     7    11 B         3     7    11
4     2     4     8    12 B         3     7    11

One option could be:

df %>%
    rowwise() %>%
    mutate(new_col = get(X)) %>%
    group_by(Group, X) %>%
    mutate(new_col = first(new_col))

  Group     A     B     C X     new_col
  <int> <int> <int> <int> <chr>   <int>
1     1     1     5     9 A           1
2     1     2     6    10 A           1
3     2     3     7    11 B           7
4     2     4     8    12 B           7

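Using by and adding + 1 to the group number to select the column. This assumes the value columns are arranged as in the example, directly after the "Group" column, so that group 1 maps to the first value column, group 2 to the second, and so on.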

transform(df, new_col=do.call(rbind, by(df, df$Group, \(x) 
                                        cbind(paste(x$X, x[1, x$Group[1] + 1])))))
#   Group A B  C X new_col
# 1     1 1 5  9 A     A 1
# 2     1 2 6 10 A     A 1
# 3     2 3 7 11 B     B 7
# 4     2 4 8 12 B     B 7
          

Note: R version 4.1.2 (2021-11-01).


Data:

df <- structure(list(Group = c(1L, 1L, 2L, 2L), A = 1:4, B = 5:8, C = 9:12, 
    X = c("A", "A", "B", "B")), class = "data.frame", row.names = c(NA, 
-4L))
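
If the positional assumption does not hold for your data, a variant that looks the column up by name may be safer. This is only a minimal sketch in base R, not part of the answer above; first_val is an illustrative name, and it returns the bare value (1, 1, 7, 7) rather than the pasted "A 1" strings:

```r
df <- data.frame(Group = c(1L, 1L, 2L, 2L), A = 1:4, B = 5:8, C = 9:12,
                 X = c("A", "A", "B", "B"))

# For each group, read the first X, then take the first value
# of the column with that name (illustrative helper vector).
first_val <- sapply(split(df, df$Group), function(g) g[[g$X[1]]][1])

# Map the per-group result back onto the rows via the group label.
df$new_col <- unname(first_val[as.character(df$Group)])
df$new_col
```

Because the lookup goes through the names produced by split(), this should not depend on the column order or on the groups being numbered 1, 2, ...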

In base R, we can use row/column indexing:

df$new_col <- df[2:4][cbind(match(unique(df$Group), df$Group)[df$Group], 
       match(df$X, names(df)[2:4]))]
df$new_col
[1] 1 1 7 7
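
To make the indexing easier to follow, here is the same idea split into steps. This is a sketch under my own naming (value_cols, first_row, col_idx and m are illustrative), and it uses match(df$Group, df$Group) for the first row of each group, which I believe avoids relying on the groups being numbered 1, 2, ...:

```r
df <- data.frame(Group = c(1L, 1L, 2L, 2L), A = 1:4, B = 5:8, C = 9:12,
                 X = c("A", "A", "B", "B"))

value_cols <- c("A", "B", "C")

# Row index of the first row of each row's group.
first_row <- match(df$Group, df$Group)

# Column index (within value_cols) named by X in that first row.
col_idx <- match(df$X[first_row], value_cols)

# A two-column (row, column) matrix picks one cell per row of the matrix.
m <- as.matrix(df[value_cols])
df$new_col <- m[cbind(first_row, col_idx)]
df$new_col
```

Each row of cbind(first_row, col_idx) addresses a single cell of m, so the result has one value per row of df.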