How to transpose columns whilst creating new data in R?

I have a genetic dataset that looks like this:

Pathway       Gene
Pathway1      Gene1
Pathway1      Gene2
Pathway2      Gene3
Pathway2      Gene1
Pathway3      Gene1
Pathway3      Gene4
Pathway3      Gene5

I want to turn the Pathway values into columns, using 1s and 0s to track which genes are present in which pathway, to create an output like this:

Gene   Pathway1  Pathway2  Pathway3
Gene1  1         1         1
Gene2  1         0         0
Gene3  0         1         0
Gene4  0         0         1
Gene5  0         0         1

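My real data is about 3000 rows long and I'm not confident in R. I have been trying to use t(), but I'm not sure where to start coding to get the binary count I'm looking for. Any help or advice on functions to try would be appreciated.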

Input example data:

structure(list(Pathway = c("Pathway1", "Pathway1", "Pathway2", 
"Pathway2", "Pathway3", "Pathway3", "Pathway3"), Gene = c("Gene1", 
"Gene2", "Gene3", "Gene1", "Gene1", "Gene4", "Gene5")), row.names = c(NA, 
-7L), class = c("data.table", "data.frame"))

A quick and dirty tidyverse solution:

library(tidyr)

# edit thanks to @Ronak Shah
df %>%
  pivot_wider(names_from = Pathway,
              values_from = Pathway,
              values_fn = length, values_fill = 0)

# A tibble: 5 x 4
  Gene  Pathway1 Pathway2 Pathway3
  <chr>    <dbl>    <dbl>    <dbl>
1 Gene1        1        1        1
2 Gene2        1        0        0
3 Gene3        0        1        0
4 Gene4        0        0        1
5 Gene5        0        0        1
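
One caveat: values_fn = length counts rows, so a Gene/Pathway pair that appears twice would give 2 rather than 1. If strict 0/1 flags are wanted, a minimal sketch is to de-duplicate first. The duplicated row and the names df_dup, present and presence are made up for illustration and are not part of the original data; dplyr is an extra dependency here.

```r
library(dplyr)
library(tidyr)

# Example data from the question, rebuilt as a plain data.frame
df <- data.frame(
  Pathway = c("Pathway1", "Pathway1", "Pathway2", "Pathway2",
              "Pathway3", "Pathway3", "Pathway3"),
  Gene = c("Gene1", "Gene2", "Gene3", "Gene1", "Gene1", "Gene4", "Gene5")
)

# Hypothetical duplicate row (not in the original data), to show the effect
df_dup <- rbind(df, df[1, ])

presence <- df_dup %>%
  distinct(Gene, Pathway) %>%        # keep one row per Gene/Pathway pair
  mutate(present = 1L) %>%           # flag every remaining pair as present
  pivot_wider(names_from = Pathway,
              values_from = present,
              values_fill = 0L)      # pairs that never occur become 0

presence
```

With the duplicate removed up front, every cell can only be 0 or 1, whatever the input looks like.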

A data.table approach:

library(data.table)
dcast(setDT(mydata), Gene ~ Pathway, value.var = "Pathway", fun.aggregate = length)
#     Gene Pathway1 Pathway2 Pathway3
# 1: Gene1        1        1        1
# 2: Gene2        1        0        0
# 3: Gene3        0        1        0
# 4: Gene4        0        0        1
# 5: Gene5        0        0        1

You can use janitor::tabyl.

janitor::tabyl(df, Gene, Pathway)

#  Gene Pathway1 Pathway2 Pathway3
# Gene1        1        1        1
# Gene2        1        0        0
# Gene3        0        1        0
# Gene4        0        0        1
# Gene5        0        0        1
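
If installing extra packages is not an option, the same cross-tabulation can probably be done in base R with table(). This is only a sketch and not one of the answers above; the names counts, presence, wide and result are made up for illustration, and the (counts > 0) step is an added assumption to keep the values at 0/1 if the real data contains repeated rows.

```r
# Example data from the question, rebuilt as a plain data.frame
df <- data.frame(
  Pathway = c("Pathway1", "Pathway1", "Pathway2", "Pathway2",
              "Pathway3", "Pathway3", "Pathway3"),
  Gene = c("Gene1", "Gene2", "Gene3", "Gene1", "Gene1", "Gene4", "Gene5")
)

# Cross-tabulate Gene against Pathway; each cell counts the matching rows
counts <- table(df$Gene, df$Pathway)

# Clip the counts to 0/1 so repeated rows still give a presence flag
presence <- (counts > 0) * 1L

# Turn the table into a wide data.frame (one column per pathway)
wide <- as.data.frame.matrix(presence)

# Move the gene names from the row names into an ordinary column
result <- cbind(Gene = rownames(wide), wide)
rownames(result) <- NULL

result
```

table() sorts the genes and pathways alphabetically, so the row order may differ from the order in which they appear in the input.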