How to set x-axes to the same scale after log-transformation with ggplot
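For weeks I have been trying to solve a seemingly simple problem: plotting two separate histograms with ggplot. The data are not normally distributed, so I log-transform them. However, I cannot get the x-axes of the separate plots to show exactly the same scale.
Here is an example: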
library(ggplot2)
# random data:
set.seed(123); g1 <- data.frame(rlnorm(1000, 1, 3))
set.seed(123); g2 <- data.frame(rlnorm(2000, 0.4, 1.2))
colnames(g1) <- "value"; colnames(g2) <- "value"
# plotting g1 in log scale
plot_g1 <- ggplot(g1, aes(x = value)) +
  labs(x = "value", y = "Frequency") +
  geom_density(alpha = 0.25) +
  theme_classic(base_size = 25, base_line_size = 0.5)
plot_g1.2 <- ggplot(g1, aes(x = value)) +
  geom_histogram(binwidth = 2.5, position = "identity", aes(y = after_stat(density)), alpha = 0.75) +
  labs(x = "value", y = "Frequency") +
  geom_density(alpha = 0.25) +
  theme_classic(base_size = 10, base_line_size = 0.5)
plot_g1.2_log <- plot_g1.2 +
  scale_x_continuous(
    trans = "log2",
    labels = scales::number_format(accuracy = 0.01, decimal.mark = '.'),
    breaks = c(0, 0.01, 0.1, 1, 10, 100, 10000),
    limits = c(-100, 20000))
[Image: plot_g1.2_log]
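The plot itself is fine, but the x-axis scale differs between the plots. I have played with limits, binwidth and breaks, but I can't get it to work.
One solution is to plot both distributions together: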
### combining both plots together
g1$cat <- "g1"; g2$cat <- "g2"; g12 <- rbind(g1, g2)
plot_g12 <- ggplot(g12, aes(x = value, fill = cat, color = cat)) +
  labs(x = "value", y = "Frequency") +
  geom_density(alpha = 0.25) +
  theme_classic(base_size = 10, base_line_size = 0.5)
plot_g12.2 <- ggplot(g12, aes(x = value, fill = cat, color = cat)) +
  geom_histogram(binwidth = 0.5, position = "identity", aes(y = after_stat(density)), alpha = 0.75) +
  labs(x = "value", y = "Frequency") +
  geom_density(alpha = 0.25) +
  theme_classic(base_size = 10, base_line_size = 0.5)
plot_g12.2_log <- plot_g12.2 +
  scale_x_continuous(
    trans = "log2",
    labels = scales::number_format(accuracy = 0.01, decimal.mark = '.'),
    breaks = c(0, 0.01, 0.1, 1, 10, 100, 10000),
    limits = c(-10, 20000))
plot_g12.2_log
But I need them to be separate.
I would be grateful if anyone could help.
Best,
L
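I think the reason you can't set the same scale is that your lower limit is invalid in log space: for example, log2(-100) evaluates to NaN. That said, have you considered faceting the data?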
library(ggplot2)
set.seed(123); g1 <- data.frame(rlnorm(1000, 1, 3))
set.seed(123); g2 <- data.frame(rlnorm(2000, 0.4, 1.2))
colnames(g1) <- "value"; colnames(g2) <- "value"
df <- rbind(
  cbind(g1, name = "G1"),
  cbind(g2, name = "G2")
)
ggplot(df, aes(value)) +
  geom_histogram(aes(y = after_stat(density)),
                 binwidth = 0.5) +
  geom_density() +
  scale_x_continuous(
    trans = "log2",
    labels = scales::number_format(accuracy = 0.01, decimal.mark = '.'),
    breaks = c(0, 0.01, 0.1, 1, 10, 100, 10000), limits = c(1e-3, 20000)) +
  facet_wrap(~ name)
#> Warning: Removed 4 rows containing non-finite values (stat_bin).
#> Warning: Removed 4 rows containing non-finite values (stat_density).
#> Warning: Removed 4 rows containing missing values (geom_bar).
Created on 2021-03-20 by the reprex package (v1.0.0)
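If the two histograms really have to stay separate plot objects, the same idea of a strictly positive lower limit can probably be reused by giving every plot an identical scale specification. The following is only a minimal sketch under that assumption: the helper names `shared_x_scale` and `plot_log_hist` are hypothetical and not part of ggplot2, the limits `c(1e-3, 20000)` are simply taken from the faceted example, and the 0 break is omitted because it has no position on a log axis.

```r
library(ggplot2)

# Same simulated data as in the question
set.seed(123); g1 <- data.frame(value = rlnorm(1000, 1, 3))
set.seed(123); g2 <- data.frame(value = rlnorm(2000, 0.4, 1.2))

# Non-positive limits have no finite image in log space
suppressWarnings(log2(c(-100, 0, 1e-3)))  # NaN, -Inf, then a finite value

# Hypothetical helper: one scale specification reused by every plot
shared_x_scale <- function() {
  scale_x_continuous(
    trans  = "log2",
    labels = scales::number_format(accuracy = 0.01, decimal.mark = "."),
    breaks = c(0.01, 0.1, 1, 10, 100, 10000),
    limits = c(1e-3, 20000)  # strictly positive, so valid on a log scale
  )
}

# Hypothetical helper: histogram plus density on the shared log2 axis
plot_log_hist <- function(data) {
  ggplot(data, aes(x = value)) +
    geom_histogram(aes(y = after_stat(density)), binwidth = 0.5, alpha = 0.75) +
    geom_density(alpha = 0.25) +
    labs(x = "value", y = "Density") +
    shared_x_scale() +
    theme_classic(base_size = 10, base_line_size = 0.5)
}

# Two independent plot objects with identical x-axes
plot_g1_log <- plot_log_hist(g1)
plot_g2_log <- plot_log_hist(g2)
```

Each object can then be printed or saved on its own. Because binning happens after the transformation, `binwidth = 0.5` is measured in log2 units here, and any values outside the limits are dropped with a warning, as in the faceted version.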