Ordering categorical variables in a dataframe

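How do I change the order in which the values of a factor are displayed in a dataframe?

Example data, using a sample of Australian state names: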

location <- c("new_south_wales", "victoria", "queensland")

Say I want victoria to appear last!

#this doesn't work
factor(location, levels = c("new_south_wales", "queensland", "victoria"))

#neither does this
ordered(location, levels = c("new_south_wales", "queensland", "victoria"))

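I also tried forcats::fct_relevel, but although I can change the levels, it still doesn't affect the order in which the factor's values are displayed.

If you want the actual factor values sorted alphanumerically, you can sort them like this.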

location <- c("new_south_wales", "victoria", "queensland")
factor(sort(location))
# [1] new_south_wales queensland      victoria       
# Levels: new_south_wales queensland victoria

Of course, you can do this either before or after creating the factor.

states <- factor(location)
states
# [1] new_south_wales victoria        queensland     
# Levels: new_south_wales queensland victoria

sort(states)
# [1] new_south_wales queensland      victoria       
# Levels: new_south_wales queensland victoria

ordered_states <- sort(states)
ordered_states
# [1] new_south_wales queensland      victoria       
# Levels: new_south_wales queensland victoria

You can also put them in a different order:

states <- factor(location[c(3, 2, 1)])
states
# [1] queensland      victoria        new_south_wales
# Levels: new_south_wales queensland victoria

# Or after the fact:
states <- factor(states[c(3, 1, 2)])
states
# [1] new_south_wales queensland      victoria
# Levels: new_south_wales queensland victoria
# Notice that this reorders the reordered states, because that's how
# states was last assigned.

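Levels are sorted alphanumerically by default, but this has no effect on the actual order of the values in the factor (as you demonstrated).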
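
A minimal sketch of that distinction, assuming the goal is to get victoria last in a data frame: setting the levels explicitly changes what order() and sort() do, but the stored values stay where they are until you actually reorder the rows. The data frame `df`, its `visits` column, and the choice of putting queensland first are invented here purely for illustration.

```r
location <- c("new_south_wales", "victoria", "queensland")

# Explicit, non-alphabetical level order with victoria last
states <- factor(location,
                 levels = c("queensland", "new_south_wales", "victoria"))

as.character(states)  # values keep their original positions
levels(states)        # levels follow the order given above

# Hypothetical data frame; 'visits' is an invented column
df <- data.frame(location = states, visits = c(10, 20, 30))

# order() on a factor follows the level order, so this reorders the rows
df_sorted <- df[order(df$location), ]
df_sorted
```

In other words, the levels define what "sorted" means, and the row order only changes once you index the data frame with order().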

As you have shown, an ordered factor isn't necessarily displayed in order. It only means that the values are ordinal.
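
As a rough illustration of what "ordinal" means in practice, the sketch below builds an ordered factor and compares its values. The level order is arbitrary and chosen only for the example.

```r
location <- c("new_south_wales", "victoria", "queensland")

states <- factor(location,
                 levels = c("new_south_wales", "queensland", "victoria"),
                 ordered = TRUE)

states               # still printed in stored order, not level order
states < "victoria"  # comparisons follow the level order: TRUE FALSE TRUE
max(states)          # the highest level present in the data
```

Comparisons like `<` and summaries like max() are only defined for ordered factors; on an unordered factor they produce NA with a warning or an error.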