How to sort mixed values in R

I have a data frame that I want to sort by one column and then by the next (using the tidyverse if possible).

I checked the following question, but the solutions there did not seem to work for me.

Order a "mixed" vector (numbers with letters)

Example code:

library(dplyr)  # provides %>%, arrange() and as_tibble()

variable <- c("channel", "channel", "channel", "comp_ded", "comp_ded", "comp_ded")
level <- c("DIR", "EA", "IA", "500", "750", "1000")
df <- as_tibble(cbind(variable, level))

This does not give me what I want:

df <- df %>% arrange(variable, level)

The level column comes out in this order:

variable  level
channel   DIR
channel   EA
channel   IA
comp_ded  1000
comp_ded  500
comp_ded  750

I need them in this order:

variable  level
channel   DIR
channel   EA
channel   IA
comp_ded  500
comp_ded  750
comp_ded  1000

The real dataset has several different "variables"; half of them need to be sorted numerically and half alphabetically. Does anyone know how to do this?

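It is a bit ugly, but you can split the data frame into two parts with filter statements, arrange each part separately, and then bind them back together: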

df <- bind_rows(df %>%
              filter(!is.na(as.numeric(level))) %>%
              arrange(variable, as.numeric(level)),
          df %>%
              filter(is.na(as.numeric(level))) %>%
              arrange(variable, level))

This gives:

# A tibble: 6 x 2
  variable level
  <chr>    <chr>
1 comp_ded 500  
2 comp_ded 750  
3 comp_ded 1000 
4 channel  DIR  
5 channel  EA   
6 channel  IA   

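You can create a temporary variable to sort on. Once the rows are in the desired order, you can also fix that order permanently by converting the column to a factor (as in @Vio's answer). Perhaps something like this: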

df = df %>% 
  mutate(tmp = as.numeric(level)) %>% 
  arrange(variable, tmp, level) %>% 
  select(-tmp) %>% 
  mutate(level = factor(level, levels=unique(level)))

This gives:

  variable level
  <chr>    <fct>
1 channel  DIR  
2 channel  EA   
3 channel  IA   
4 comp_ded 500  
5 comp_ded 750  
6 comp_ded 1000

I think you can also shorten this by not creating the temporary variable explicitly and instead using an "anonymous" variable inside arrange:

df = df %>% 
  arrange(variable, as.numeric(level), level) %>% 
  mutate(level = factor(level, levels=unique(level)))
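
The reason this works is that as.numeric() returns NA for strings that are not numbers, and NA values are sorted last, so rows that tie on NA fall through to the plain level column. A minimal base R sketch of the same idea on a bare vector, assuming order()'s default NA-last behaviour (the vector x is only an illustration, not part of the question's data):

```r
# Illustrative vector mixing numeric-looking strings and plain text
x <- c("DIR", "1000", "EA", "500", "IA", "750")

# Non-numeric strings become NA; suppressWarnings() hides the coercion warning
num <- suppressWarnings(as.numeric(x))

# order() sorts by num first (NA last by default), then breaks ties with x,
# so numeric strings come out by value and the rest alphabetically
sorted <- x[order(num, x)]
print(sorted)
```

Inside arrange(variable, ...) the same two keys are applied within each value of variable, which is why the numeric and the alphabetical groups each end up in the expected order.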

Convert to a factor and set the order of its levels. This is easier with forcats::fct_relevel().

# Convert to factor
df <- as_tibble(cbind(variable, level)) %>%
  mutate(level = as.factor(level))

# Change order of levels. Assigning to levels() would relabel the data
# instead of reordering it, so rebuild the factor with explicit levels.
df$level <- factor(df$level, levels = c("DIR", "EA", "IA", "500", "750", "1000"))

df %>% arrange(level)

# A tibble: 6 x 2
  variable level
  <chr>    <fct>
1 channel  DIR  
2 channel  EA   
3 channel  IA   
4 comp_ded 500  
5 comp_ded 750  
6 comp_ded 1000 
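
Since forcats::fct_relevel() is mentioned as the easier route, here is a minimal sketch of what that might look like. It assumes the forcats package is installed and uses a standalone factor f rather than the data frame column, purely for illustration:

```r
library(forcats)

level <- c("DIR", "EA", "IA", "500", "750", "1000")

# as.factor() sorts the levels as strings, so "1000" comes before "500"
f <- as.factor(level)

# fct_relevel() moves the named levels to the front, in the order given,
# without changing which value each element holds
f <- fct_relevel(f, "DIR", "EA", "IA", "500", "750", "1000")

levels(f)
as.character(f)  # the underlying values are unchanged
```

In a pipeline the same call could sit inside mutate(), for example mutate(level = fct_relevel(level, ...)), followed by arrange(level).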

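A slightly shorter solution using mixedorder from gtools:

library(gtools)
# mixedorder() returns a permutation (like order()), so order() is applied to it
# once more to turn it into ranks that can serve as a tie-breaking sort key
sorteddf <- df[with(df, order(variable, order(mixedorder(level)))),]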

Output:

  variable level
1 channel  DIR  
2 channel  EA   
3 channel  IA   
4 comp_ded 500  
5 comp_ded 750  
6 comp_ded 1000

Another option is to use dplyr::group_by:

library(dplyr)

variable <- c("channel", "channel", "channel", "comp_ded", "comp_ded", "comp_ded")
level <- c("DIR", "EA", "IA", "500", "750", "1000")
df <- as_tibble(cbind(variable, level))

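df %>%
  group_by(variable) %>%
  # arrange() with no arguments leaves the rows as they are, so sort keys are needed;
  # .by_group = TRUE sorts by the grouping variable first
  arrange(as.numeric(level), level, .by_group = TRUE) %>%
  ungroup()

# A tibble: 6 x 2
  variable level
  <chr>    <chr>
1 channel  DIR  
2 channel  EA   
3 channel  IA   
4 comp_ded 500  
5 comp_ded 750  
6 comp_ded 1000 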

I think it is much easier to sort first by as.numeric(level) and then by level:

df %>% arrange(variable, as.numeric(level), level)

which gives:

# A tibble: 6 x 2
  variable level
  <chr>    <chr>
1 channel  DIR
2 channel  EA
3 channel  IA
4 comp_ded 500
5 comp_ded 750
6 comp_ded 1000
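
Because the real dataset has several such columns, the idea could be wrapped in a small function for reuse. The sketch below uses base R only; sort_mixed and its argument names are hypothetical, and the row order of the sample data frame is shuffled on purpose so the effect of the sort is visible:

```r
# Hypothetical helper: sort a data frame by a grouping column, then by a
# "mixed" column (numeric where possible, alphabetical otherwise)
sort_mixed <- function(data, group_col, mixed_col) {
  values <- as.character(data[[mixed_col]])
  # NA for anything that is not a number; the coercion warning is suppressed on purpose
  numeric_key <- suppressWarnings(as.numeric(values))
  # order() places NA last, so numeric rows sort by value and the
  # remaining rows fall back to alphabetical order
  data[order(data[[group_col]], numeric_key, values), , drop = FALSE]
}

df <- data.frame(
  variable = c("comp_ded", "channel", "comp_ded", "channel", "comp_ded", "channel"),
  level    = c("1000", "IA", "500", "DIR", "750", "EA"),
  stringsAsFactors = FALSE
)

sorted <- sort_mixed(df, "variable", "level")
print(sorted)
```

Unlike the arrange() version, this does not emit "NAs introduced by coercion" warnings, which may matter when the sort runs inside a larger script.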