How to specify ggplot legend order when you have multiple variables that are not all part of one column?
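I'm using ggplot to plot the same data at different time scales (week, month, quarter, etc.), so I'm pulling the data from different columns. However, I want the entries in the legend to appear in a specific order.
I know that if all the grouping variables were in one column, I could make that column an ordered factor, as explained here, but my data are spread across multiple columns. I also tried the suggestions about reordering multiple geoms, but that didn't work.
Because my actual dataset is very complicated, I've reproduced a smaller version with just week and month data. For the final answer, please make it possible to specify an explicit order rather than using something like rev(), because in my actual dataset I have 6 columns that need a specific order.
Here is the code to reproduce the problem. The first 3 chunks build the dataset, so only the 4th chunk, which makes the plot, should be relevant to the solution. By default, R shows 'Score - Month' first in the legend, so I'd like to know how to make it second.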
library(dplyr)
library(ggplot2)
library(lubridate)
#Generates week data -- shouldn't be relevant to troubleshoot
by_week <- tibble(Week = seq(as.Date("2011-01-01"), as.Date("2012-07-01"), by="weeks"),
Week_score = c(sample(100:200, 79)),
Month = ymd(format(Week, "%Y-%m-01")))
#Generates month data -- shouldn't be relevant to troubleshoot
by_month <- tibble(Month = seq(as.Date("2011-01-01"), as.Date("2012-07-01"), by="months"),
Month_score = c(sample(150:200, 19)))
#Joins data and removes duplications of month data for easier plotting -- shouldn't be relevant to troubleshoot
all_time <- by_week %>%
full_join(by_month) %>%
mutate(helper = across(c(contains("Month")), ~paste(.))) %>%
mutate(across(c(contains("Month")), ~ifelse(duplicated(helper), NA, .)), .keep="unused") %>%
mutate(Month = as.Date(Month))
#Makes plot - this is where I want the order in the legend to be different
all_time %>%
ggplot(aes(x = Week)) +
geom_line(aes(y= Week_score, colour = "Week_score")) +
geom_line(data=all_time[!is.na(all_time$Month_score),], aes(y = Month_score, colour = "Month_score")) + #This line tells R just to focus on non-missing values for Month_score
scale_colour_discrete(labels = c("Week_score" = "Score - Week", "Month_score" = "Score - Month"))
This is what the legend currently looks like. I'd like to switch the order using a solution that scales to more than 2 options. Thanks!
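As @stefan mentioned in the comments, you should set the limits argument of scale_colour_discrete to the colour keys (the strings used inside aes(), such as "Week_score"), listed in the order you want them to appear in the legend. Without limits, a discrete scale sorts its keys alphabetically, which is why "Month_score" comes first. The same approach extends to more columns: add each key to limits in the desired position. You can use the following code: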
library(dplyr)
library(ggplot2)
library(lubridate)
#Generates week data -- shouldn't be relevant to troubleshoot
by_week <- tibble(Week = seq(as.Date("2011-01-01"), as.Date("2012-07-01"), by="weeks"),
Week_score = c(sample(100:200, 79)),
Month = ymd(format(Week, "%Y-%m-01")))
#Generates month data -- shouldn't be relevant to troubleshoot
by_month <- tibble(Month = seq(as.Date("2011-01-01"), as.Date("2012-07-01"), by="months"),
Month_score = c(sample(150:200, 19)))
#Joins data and removes duplications of month data for easier plotting -- shouldn't be relevant to troubleshoot
all_time <- by_week %>%
full_join(by_month) %>%
mutate(helper = across(c(contains("Month")), ~paste(.))) %>%
mutate(across(c(contains("Month")), ~ifelse(duplicated(helper), NA, .)), .keep="unused") %>%
mutate(Month = as.Date(Month))
#Makes plot - this is where I want the order in the legend to be different
all_time %>%
ggplot(aes(x = Week)) +
geom_line(aes(y= Week_score, colour = "Week_score")) +
geom_line(data=all_time[!is.na(all_time$Month_score),], aes(y = Month_score, colour = "Month_score")) + #This line tells R just to focus on non-missing values for Month_score
scale_colour_discrete(labels = c("Week_score" = "Score - Week", "Month_score" = "Score - Month"), limits = c("Week_score", "Month_score"))
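Output:
As you can see, the order of the labels has changed: 'Score - Week' is now listed first and 'Score - Month' second.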
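For more than two columns, one way to keep the order and the labels in sync is to store both in a single named vector and pass its names as limits. The following is only a minimal sketch under stated assumptions: the scores data frame, the Quarter_score column, and the legend_labels vector are hypothetical and are not part of the original data.

```r
library(ggplot2)

# Hypothetical data: three score columns sharing one date axis
set.seed(1)
scores <- data.frame(
  Date          = seq(as.Date("2011-01-01"), by = "week", length.out = 20),
  Week_score    = sample(100:200, 20),
  Month_score   = sample(150:200, 20),
  Quarter_score = sample(120:180, 20)
)

# Names are the colour keys used inside aes(); values are the legend labels.
# The order of this vector is the order the legend will use.
legend_labels <- c(
  Week_score    = "Score - Week",
  Month_score   = "Score - Month",
  Quarter_score = "Score - Quarter"
)

p <- ggplot(scores, aes(x = Date)) +
  geom_line(aes(y = Week_score, colour = "Week_score")) +
  geom_line(aes(y = Month_score, colour = "Month_score")) +
  geom_line(aes(y = Quarter_score, colour = "Quarter_score")) +
  scale_colour_discrete(labels = legend_labels, limits = names(legend_labels))

p
```

Reordering the legend then only requires reordering legend_labels. Note that the default palette is assigned in limits order, so the line colours may change along with the legend order.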