Ordering a column based on a list, then sorting another column in a data frame
I have the following data frame:
tdf <- structure(list(GO = c("Cytokine-cytokine receptor interaction",
"Cytokine-cytokine receptor interaction|Endocytosis", "I-kappaB kinase/NF-kappaB signaling",
"NF-kappa B signaling pathway", "NF-kappaB import into nucleus",
"T cell chemotaxis"), PosCount = c(17, 18, 4, 5, 1, 2), shortgo = structure(c(1L,
1L, 2L, 2L, 2L, 3L), .Label = c("z", "X", "y"), class = "factor")), .Names = c("GO",
"PosCount", "shortgo"), row.names = c(NA, 6L), class = "data.frame")
desired_order <- c("y", "X", "z")
It looks like this:
  GO                                                 PosCount shortgo
1 Cytokine-cytokine receptor interaction                   17 z
2 Cytokine-cytokine receptor interaction|Endocytosis       18 z
3 I-kappaB kinase/NF-kappaB signaling                       4 X
4 NF-kappa B signaling pathway                              5 X
5 NF-kappaB import into nucleus                             1 X
6 T cell chemotaxis                                         2 y
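What I want to do is order shortgo according to a predefined list:
desired_order <- c("y", "X", "z")
and then, within each shortgo group, sort the rows by PosCount in descending order, producing this: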
GO                                                 PosCount shortgo
T cell chemotaxis                                         2 y
NF-kappa B signaling pathway                              5 X
I-kappaB kinase/NF-kappaB signaling                       4 X
NF-kappaB import into nucleus                             1 X
Cytokine-cytokine receptor interaction|Endocytosis       18 z
Cytokine-cytokine receptor interaction                   17 z
I tried this, but it failed:
library(dplyr)
#tdf %>% arrange(as.character(shortgo), desc(PosCount))
tdf %>% arrange(desired_order, desc(PosCount))
What is the right way to do it?
Use a factor representation of the variable to impose the order you want.
In dplyr, simply do:
tdf %>% arrange(factor(shortgo, levels = desired_order), desc(PosCount))
In base R, simply use:
tdf[order(factor(tdf$shortgo, levels = desired_order), -tdf$PosCount), ]
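To see why the factor approach works, here is a minimal base R sketch. It assumes a small stand-in data frame named df with placeholder GO labels (both are illustrative, not from the question). The original attempt most likely fails because desired_order is a length-3 vector rather than a per-row sort key; re-leveling shortgo instead gives every row an integer code that follows desired_order. The match() variant at the end is a swapped-in alternative, not part of the answer above.

```r
# Stand-in data frame (hypothetical GO labels, same shape as tdf)
df <- data.frame(
  GO = c("a", "b", "c", "d", "e", "f"),
  PosCount = c(17, 18, 4, 5, 1, 2),
  shortgo = factor(c("z", "z", "X", "X", "X", "y"), levels = c("z", "X", "y")),
  stringsAsFactors = FALSE
)
desired_order <- c("y", "X", "z")

# Re-leveling changes the integer codes that order() sorts on:
# y becomes 1, X becomes 2, z becomes 3
releveled <- factor(df$shortgo, levels = desired_order)
codes <- as.integer(releveled)

# Sort by the re-leveled factor, then by PosCount descending within each group
sorted <- df[order(releveled, -df$PosCount), ]

# Alternative sketch: match() returns each value's position in desired_order
sorted_match <- df[order(match(df$shortgo, desired_order), -df$PosCount), ]

print(sorted)
```

One thing to keep in mind: any shortgo value that does not appear in desired_order becomes NA in the re-leveled factor, and order() places NA rows last by default.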