Forcats solution for reordering based on another column
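OK, I know that fct_reorder() lets you reorder a factor based on another column, but as far as I can tell you have to supply a function that summarizes the second column (mean, median, etc.) so it knows how to order the levels. What if you already have another column that orders the factor exactly the way you want it leveled, as is?
For example, I have a column ACADEMIC_PERIOD_DESC that gives the academic term in English ("Fall 2019", "Spring 2020", etc.), and a corresponding column ACADEMIC_PERIOD that holds the numeric code for the same term ("201940", "202020", etc.). ACADEMIC_PERIOD is the column I want the levels of ACADEMIC_PERIOD_DESC to be ordered by.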
Data
df <- structure(list(ACADEMIC_PERIOD = c("200810", "200820", "200830",
"200840", "200910", "200920", "200930", "200940", "201010", "201020"
), ACADEMIC_PERIOD_DESC = structure(1:10, .Label = c("J-Term 2008",
"Spring 2008", "Summer 2008", "Fall 2008", "J-Term 2009", "Spring 2009",
"Summer 2009", "Fall 2009", "J-Term 2010", "Spring 2010", "Summer 2010",
"Fall 2010", "J-Term 2011", "Spring 2011", "Summer 2011", "Fall 2011",
"J-Term 2012", "Spring 2012", "Summer 2012", "Fall 2012", "J-Term 2013",
"Spring 2013", "Summer 2013", "Fall 2013", "Spring 2014", "Summer 2014",
"Fall 2014", "J-Term 2015", "Spring 2015", "Summer 2015", "Fall 2015",
"J-Term 2016", "Spring 2016", "Summer 2016", "Fall 2016", "J-Term 2017",
"Spring 2017", "Summer 2017", "Fall 2017", "J-Term 2018", "Spring 2018",
"Summer 2018", "Fall 2018", "J-Term 2019", "Spring 2019", "Summer 2019",
"Fall 2019", "J-Term 2020", "Spring 2020"), class = "factor")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -10L))
Should I just do the following, even though it applies the median unnecessarily?
df %>%
mutate(ACADEMIC_PERIOD_DESC = fct_reorder(ACADEMIC_PERIOD_DESC, as.integer(ACADEMIC_PERIOD)))
I also know I could use base R like this:
# ACADEMIC_PERIOD is character; convert it so reorder()'s default mean() gets numbers
df$ACADEMIC_PERIOD_DESC <- reorder(df$ACADEMIC_PERIOD_DESC, as.integer(df$ACADEMIC_PERIOD))
Is there a more elegant forcats/tidyverse solution? Am I missing something?
Thanks!
We can change .fun from its default, median, to I, which takes the values as they are:
library(dplyr)
library(forcats)
df %>%
mutate(ACADEMIC_PERIOD_DESC = fct_reorder(ACADEMIC_PERIOD_DESC,
as.integer(ACADEMIC_PERIOD), .fun = I))
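As a quick sanity check, here is a minimal sketch on hypothetical toy data (the `toy` data frame and its `period` and `desc` columns are made up for illustration and are not part of the question's data). It assumes each description maps to exactly one code. In that case the summary function has nothing to aggregate, so even the default median simply returns the code itself and the levels end up in code order:

```r
library(dplyr)
library(forcats)

# Hypothetical toy data: factor() sorts the levels alphabetically,
# so "Fall 2008" starts out as the first level.
toy <- data.frame(
  period = c("200840", "200810", "200830", "200820"),
  desc   = factor(c("Fall 2008", "J-Term 2008", "Summer 2008", "Spring 2008"))
)

levels(toy$desc)
# "Fall 2008" "J-Term 2008" "Spring 2008" "Summer 2008"

# One code per level, so the default .fun (median) returns that code unchanged.
reordered <- toy %>%
  mutate(desc = fct_reorder(desc, as.integer(period)))

levels(reordered$desc)
# "J-Term 2008" "Spring 2008" "Summer 2008" "Fall 2008"
```

Only the level order changes; the rows of the data frame stay in their original order.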