How to assign 1s and 0s to columns if the variable in a row matches or does not match in R

Question

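I'm an absolute beginner in coding and R, and this is my third week doing this for a project. (For the biologists: I'm trying to find the sum of the risk alleles for a PRS.) I need help with this part: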

```
df
  x y z
1 t c a
2 a t a
3 g g t
```

So after the code is applied:

```
  x y z
1 t 0 0
2 a 0 1
3 g 1 0
```

I'm trying to make it so that if the value in y or z matches the value of x in the same row, it changes to 1, and if not, to 0.
I started with:
```
for(i in 1:ncol(df)){
  df[, i]<-df[df$x == df[,i], df[ ,i]<- 1]
}
```
But I got all NA values.
In reality, I have 100 columns that I have to compare with x in the data frame. Any help is appreciated.

Answer 1

A tidyverse approach:

library(dplyr)

df <-
  tibble(
    x = c("t","a","g"),
    y = c("c","t","g"),
    z = c("a","a","t")
  )

df %>% 
  mutate(
    across(
      .cols = c(y,z),
      .fns = ~if_else(. == x,1,0) 
    )
  )

# A tibble: 3 x 3
  x         y     z
  <chr> <dbl> <dbl>
1 t         0     0
2 a         0     1
3 g         1     0

Answer 2

Another approach is to use ifelse() in base R.

df$y <- ifelse(df$y == df$x, 1, 0)
df$z <- ifelse(df$z == df$x, 1, 0)
df
#  x y z
#1 t 0 0
#2 a 0 1
#3 g 1 0
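Since the question mentions about 100 columns, the two assignments above could also be written as a loop over column names, which is close to what the question attempted. This is only a minimal sketch: it assumes the reference column is named x and that every other column should be compared with it.

```r
# Rebuild the example data from the question.
df <- data.frame(
  x = c("t", "a", "g"),
  y = c("c", "t", "g"),
  z = c("a", "a", "t"),
  stringsAsFactors = FALSE
)

# Loop over every column name except the reference column x
# (assumption: all remaining columns are targets).
for (col in setdiff(names(df), "x")) {
  # Element-wise comparison with x: 1 where the values match, 0 otherwise.
  df[[col]] <- ifelse(df[[col]] == df$x, 1, 0)
}

df
```

Indexing with `df[[col]]` picks one column by name, so the loop never touches x itself.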

Edit: extending this step efficiently to all columns

For example:

df1
#  x y z w
#1 t c a t
#2 a t a a
#3 g g t m

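To apply the column edit efficiently, a better approach is to write a function and apply it to all target columns in the data frame. Here is a simple function that does the job: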

edit_col <- function(any_col) any_col <- ifelse(any_col == df1$x, 1, 0)

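This function takes a single column, compares its elements with the elements of df1$x, and edits the column accordingly. To apply it to all target columns, you can use apply(). Because x is not a target column in your case, you need to exclude it with the index [-1], since it is the first column in the data frame.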

# Here number 2 indicates columns. Use number 1 for rows.

df1[, -1] <- apply(df1[,-1], 2, edit_col)
df1
#  x y z w
#1 t 0 0 1
#2 a 0 1 1
#3 g 1 0 0
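As a possible variation on the apply() call above, the following sketch uses lapply() instead. apply() first converts the selected columns to a matrix, while lapply() works on the data frame columns directly. The variable name `target` is made up for illustration, and the data is the df1 example from this answer.

```r
# Rebuild the df1 example from this answer.
df1 <- data.frame(
  x = c("t", "a", "g"),
  y = c("c", "t", "g"),
  z = c("a", "a", "t"),
  w = c("t", "a", "m"),
  stringsAsFactors = FALSE
)

# `target` is a hypothetical name: every column except x.
target <- setdiff(names(df1), "x")

# Compare each target column with x element by element;
# as.integer() turns TRUE/FALSE into 1/0.
df1[target] <- lapply(df1[target], function(col) as.integer(col == df1$x))

df1
```

Selecting the columns by name rather than with [-1] means the sketch does not depend on x being the first column.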

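Of course, you can also define a function that edits the whole data frame, so you don't need to call apply() manually.

Here is an example of such a function: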

edit_df <- function(any_df){
    edit_col <- function(any_col) any_col <- ifelse(any_col == any_df$x, 1, 0)
    
    # Create a vector containing all names of the targeted columns.
    
    target_col_names <- setdiff(colnames(any_df), "x")
    
    any_df[,target_col_names] <-apply( any_df[,target_col_names], 2, edit_col)
    return(any_df)
}

Then use the function:

edit_df(df1)
#  x y z w
#1 t 0 0 1
#2 a 0 1 1
#3 g 1 0 0
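The question mentions that the end goal is a sum of risk alleles. Assuming that means a per-row count of the 1s in the result above, a minimal sketch with rowSums() might look like this. The data frame `res` simply retypes the output shown above, and the column name `n_match` is made up for illustration.

```r
# The 0/1 result shown above, retyped so the sketch is self-contained.
res <- data.frame(
  x = c("t", "a", "g"),
  y = c(0, 0, 1),
  z = c(0, 1, 0),
  w = c(1, 1, 0),
  stringsAsFactors = FALSE
)

# Sum the 0/1 columns row by row, leaving out the reference column x.
# `n_match` is a hypothetical column name.
flag_cols <- setdiff(names(res), "x")
res$n_match <- rowSums(res[flag_cols])

res
```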
