How can I modify this scatterplot to include a hierarchy based on a 3rd column of data?

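I would like to plot a scatterplot of PM2.5 against life expectancy, split into 5 subcategories based on the GDP data (5 differently coloured sets of points and lines, ordered by GDP from high to low). How would I modify my current code to do this (or something similar)? Code and data are below; any help is much appreciated.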

plot = ggplot(dat6, aes(x=log(PM2.5), y= log(Life_Expectancy))) +
  geom_point(colour = 'blue') +
  stat_smooth(method = "lm", col = "red") +
  xlab("Concentration of PM2.5") +
  ylab("Life Expectancy") +
  ggtitle("Relationship between Life expectancy and PM2.5")



dat6
                 Country Life_Expectancy         GDP     PM2.5
1                Afghanistan        60.38333   1788.3152 53.933333
2                    Albania        77.03333  10642.3801 20.408333
3                    Algeria        75.16667  13674.2199 31.521667
4                     Angola        51.96667   6770.9149 37.346667
5        Antigua and Barbuda        75.98333  20893.5925 20.415000
6                  Argentina        75.93333  19838.7166 11.893333
7                    Armenia        74.26667   7728.3425 33.143333
8                  Australia        82.36667  43862.4894  7.338333
9                    Austria        84.00000  46586.1927 14.303333
10                Azerbaijan        72.00000  16804.9607 20.308333

Here is an example of what the question asks for.

cut is used to create a new column, GDP_Level, from the vector of break points brks. The levels are given names ranging from "Very Low" to "Very High".
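To see how cut treats values that sit exactly on a break point, here is a minimal sketch in base R. The input values in gdp_example and the names lvl_names and lvl are hypothetical and chosen only to hit the boundaries; the breaks are the ones used in this answer.

```r
# Hypothetical GDP values picked to land on or next to the break points
gdp_example <- c(0, 5000, 5000.01, 40000, 1e6)

brks <- c(0, 5000, 10000, 20000, 40000, Inf)
lvl_names <- c("Very Low", "Low", "Medium", "High", "Very High")

# By default the intervals are right-closed: (0, 5000], (5000, 10000], ...
# A value equal to the lowest break (0 here) becomes NA unless
# include.lowest = TRUE is passed.
lvl <- cut(gdp_example, breaks = brks, labels = lvl_names)

print(data.frame(gdp_example, lvl))
```

The result is an ordinary factor whose levels follow the order of the labels, which is also the order ggplot2 uses for the legend.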

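As for the plot, I have removed the log transformations from the coordinates code and included them instead as transformations in the two scale_*_continuous calls.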
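The difference between the two approaches can be sketched on toy data. The data frame d and the object names below are hypothetical and used only for illustration. The sketch assumes ggplot2 is installed and that layer_data returns the positions after the scale transformation.

```r
library(ggplot2)

# Hypothetical toy data, used only for illustration
d <- data.frame(x = c(1, 10, 100), y = c(2, 20, 200))

# Option A: log() inside aes(); the axes are labelled with the logged values
p_aes <- ggplot(d, aes(x = log(x), y = log(y))) +
  geom_point()

# Option B: raw columns in aes(), log transformation in the scales;
# the axes are labelled in the original units
p_scale <- ggplot(d, aes(x = x, y = y)) +
  geom_point() +
  scale_x_continuous(trans = "log") +
  scale_y_continuous(trans = "log")

# Compare where each version places the points
pos_aes <- layer_data(p_aes)
pos_scale <- layer_data(p_scale)
print(all.equal(pos_aes$x, pos_scale$x))
print(all.equal(pos_aes$y, pos_scale$y))
```

The geometry is the same either way. Putting the transformation in the scale mainly changes how the axes are labelled and keeps the aes() call free of arithmetic.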

dat6 <- read.table(text = "
                 Country Life_Expectancy         GDP     PM2.5
1                Afghanistan        60.38333   1788.3152 53.933333
2                    Albania        77.03333  10642.3801 20.408333
3                    Algeria        75.16667  13674.2199 31.521667
4                     Angola        51.96667   6770.9149 37.346667
5        'Antigua and Barbuda'        75.98333  20893.5925 20.415000
6                  Argentina        75.93333  19838.7166 11.893333
7                    Armenia        74.26667   7728.3425 33.143333
8                  Australia        82.36667  43862.4894  7.338333
9                    Austria        84.00000  46586.1927 14.303333
10                Azerbaijan        72.00000  16804.9607 20.308333
", header = TRUE)

library(ggplot2)

brks <- c(0, 5000, 10000, 20000, 40000, Inf)
dat6$GDP_Level <- cut(dat6$GDP, breaks = brks, labels = c("Very Low", "Low", "Medium", "High", "Very High"))

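# Colour is mapped to GDP_Level and not overridden inside the layers,
# so points and regression lines are drawn separately for each GDP level.
# se = FALSE: confidence bands cannot be estimated for levels with only one or two rows.
ggplot(dat6, aes(x = PM2.5, y = Life_Expectancy, color = GDP_Level)) +
  geom_point() +
  stat_smooth(formula = y ~ x, method = "lm", se = FALSE) +
  xlab("Concentration of PM2.5") +
  ylab("Life Expectancy") +
  scale_x_continuous(trans = "log") +
  scale_y_continuous(trans = "log") +
  ggtitle("Relationship between Life expectancy and PM2.5")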
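With only the ten rows posted in the question, some GDP levels contain very few countries, so a separate linear fit per level is poorly determined for those levels. A quick way to check the group sizes is sketched below on the GDP column alone. The vector gdp is copied from the sample data, and the names gdp, gdp_level and counts are illustrative.

```r
# GDP values copied from the ten sample rows in the question
gdp <- c(1788.3152, 10642.3801, 13674.2199, 6770.9149, 20893.5925,
         19838.7166, 7728.3425, 43862.4894, 46586.1927, 16804.9607)

brks <- c(0, 5000, 10000, 20000, 40000, Inf)
gdp_level <- cut(gdp, breaks = brks,
                 labels = c("Very Low", "Low", "Medium", "High", "Very High"))

# Number of countries falling into each GDP level
counts <- table(gdp_level)
print(counts)
```

On the full dataset the groups are presumably larger. If a level still ends up nearly empty, adjusting the values in brks is the simplest remedy.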

Created on 2022-02-21 by the reprex package (v2.0.1)