Change the levels of multiple factors that start_with the same pattern in R

Question

I have a long data frame with thousands of observations, but for demonstration purposes I am showing this small one:

df <- data.frame(col1=rep(c("A","B","C"),3),
                 col2=c("10.01","10.02","10.03","100.1","100.2","100.3","12.1","12.2","12.3"))

  col1  col2
1    A 10.01
2    B 10.02
3    C 10.03
4    A 100.1
5    B 100.2
6    C 100.3
7    A  12.1
8    B  12.2
9    C  12.3

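library(dplyr)

# Convert every column to a factor; levels default to alphabetical order
df <- df %>% 
  mutate(across(everything(), as.factor))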

levels(df$col2)
"10.01" "10.02" "10.03" "100.1" "100.2" "100.3" "12.1"  "12.2"  "12.3"

I want to change the order of the levels in col2 so that they follow numeric order, like this:

"10.01" "10.02" "10.03" "12.1"  "12.2"  "12.3" "100.1" "100.2" "100.3"

Any help or comment is highly appreciated.

Answer 1

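Use forcats::fct_inseq, which reorders the levels of a factor by their numeric value:

library(dplyr)
library(forcats)

df <- df %>% 
  mutate(across(everything(), as.factor)) %>% 
  # Refer to the column directly; df$col2 inside the pipe would read the old df
  mutate(col2 = fct_inseq(col2))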

Output:

levels(df$col2)
[1] "10.01" "10.02" "10.03" "12.1"  "12.2"  "12.3"  "100.1" "100.2" "100.3"
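
Since the title asks about several factors whose names start with the same pattern, the same idea can probably be extended with across() and starts_with(). The following is only a minimal sketch: the data frame df2 and the column names id, val_a and val_b are hypothetical and do not come from the question. It assumes every selected column has levels that can be read as numbers, because fct_inseq is expected to fail on a factor with no numeric-looking levels (which is why id is left out of the selection).

```r
library(dplyr)
library(forcats)

# Hypothetical data: two columns sharing the prefix "val_" (names are assumptions)
df2 <- data.frame(
  id    = c("A", "B", "C"),
  val_a = c("100.1", "12.1", "10.01"),
  val_b = c("9.5", "100.2", "12.2")
)

df2 <- df2 %>%
  # Convert every column to a factor first, as in the question
  mutate(across(everything(), as.factor)) %>%
  # Reorder levels by numeric value only in columns whose names start with "val_"
  mutate(across(starts_with("val_"), fct_inseq))

levels(df2$val_a)
levels(df2$val_b)
```

Restricting the selection with starts_with() keeps non-numeric factors such as id untouched.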

Answer 2

You can use gtools::mixedsort, which sorts strings containing numbers in their natural order:

library(dplyr)

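# Rebuild each column as a factor whose levels are its unique values in mixed (natural) order
df <- df %>%
  mutate(across(everything(),
                ~ factor(.x, levels = gtools::mixedsort(unique(as.character(.x))))))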

str(df)
#'data.frame':  9 obs. of  2 variables:
# $ col1: Factor w/ 3 levels "A","B","C": 1 2 3 1 2 3 1 2 3
# $ col2: Factor w/ 9 levels "10.01","10.02",..: 1 2 3 7 8 9 4 5 6

sapply(df, levels)
#$col1
#[1] "A" "B" "C"

#$col2
#[1] "10.01" "10.02" "10.03" "12.1"  "12.2"  "12.3"  "100.1" "100.2" "100.3"
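
If adding a package is not desirable, the ordering for a single column can also be sketched in base R. This is not the approach used in either answer; it is an alternative that assumes every value in col2 can be converted with as.numeric. The helper variables vals and num_levels are names introduced here for illustration.

```r
# Same demo data as in the question
df <- data.frame(col1 = rep(c("A", "B", "C"), 3),
                 col2 = c("10.01", "10.02", "10.03", "100.1", "100.2",
                          "100.3", "12.1", "12.2", "12.3"))

# Work on character values so this also behaves if col2 is already a factor
vals <- as.character(df$col2)

# Unique values sorted by their numeric value become the level order
num_levels <- unique(vals[order(as.numeric(vals))])

df$col2 <- factor(vals, levels = num_levels)
levels(df$col2)
```

Only the level order changes; the values stored in each row stay the same.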
