ggplot - histogram - density plot by group: only one group appears
I have a small dataset, and I want to use ggplot to draw a histogram/density plot by group. My dataset looks like this:
> data_Test_augm
mpg cyl4 cyl6 cyl8 disp hp drat wt qsec vs0 vs1 gear3 gear4 gear5 carb1 carb2 carb3 carb4 carb6 carb8 am predictions
1: 21.0 0 1 0 160.0 110 3.90 2.620 16.46 1 0 0 1 0 0 0 0 1 0 0 1 1.000000e+00
2: 21.4 0 1 0 258.0 110 3.08 3.215 19.44 0 1 1 0 0 1 0 0 0 0 0 0 7.884922e-12
3: 17.8 0 1 0 167.6 123 3.92 3.440 18.90 0 1 0 1 0 0 0 0 1 0 0 0 7.884924e-12
4: 32.4 1 0 0 78.7 66 4.08 2.200 19.47 0 1 0 1 0 1 0 0 0 0 0 1 1.000000e+00
5: 30.4 1 0 0 75.7 52 4.93 1.615 18.52 0 1 0 1 0 0 1 0 0 0 0 1 7.884886e-12
6: 19.2 0 0 1 400.0 175 3.08 3.845 17.05 1 0 1 0 0 0 1 0 0 0 0 0 7.884923e-12
7: 26.0 1 0 0 120.3 91 4.43 2.140 16.70 1 0 0 0 1 0 1 0 0 0 0 1 1.000000e+00
8: 30.4 1 0 0 95.1 113 3.77 1.513 16.90 0 1 0 0 1 0 1 0 0 0 0 1 7.884916e-12
9: 19.7 0 1 0 145.0 175 3.62 2.770 15.50 1 0 0 0 1 0 0 0 0 1 0 1 1.000000e+00
10: 21.4 1 0 0 121.0 109 4.11 2.780 18.60 0 1 0 1 0 0 1 0 0 0 0 1 7.884918e-12
My code is as follows:
library(ggplot2)

# Set the factor levels in the desired order
data_Test_augm$am <- factor(data_Test_augm$am, levels = c("1", "0"))

ggplot(data_Test_augm, aes(x = predictions, fill = am, color = am)) +
  # after_stat(density) replaces the deprecated ..density.. notation
  geom_histogram(aes(y = after_stat(density)), position = "identity", alpha = 0.4) +
  geom_density(alpha = 0.5) +
  labs(title = "Predicted Probabilities per am in the Test Dataset",
       x = "Predicted Probability of being in am 1", y = "Count") +
  # Define the mapping between factor levels, labels and colors
  scale_fill_manual(limits = c("1", "0"),
                    labels = c("Positive", "Negative"),
                    values = c("red", "blue")) +
  labs(fill = "am") +      # Set the title of the legend
  guides(color = "none")   # Hide the legend for color
This is the output:
I don't understand why only one level of the grouping variable am appears in the plot.
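This is more of a comment than an answer, but my reputation is too low to post one.
Are you sure you want the density? Without it you get a nice histogram: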
library(ggplot2)

# Reproduce only the two columns the plot needs
dt <- data.frame(
  am = c(1, 0, 0, 1, 1, 0, 1, 1, 1, 1),
  predictions = c(1, 7.884922e-12, 7.884924e-12, 1, 7.884886e-12,
                  7.884923e-12, 1, 7.884916e-12, 1, 7.884918e-12)
)
dt$am <- factor(dt$am, levels = c("1", "0"))

g <- ggplot(dt, aes(x = predictions, fill = am))
# after_stat(density) replaces the deprecated ..density.. notation
g + geom_histogram(aes(y = after_stat(density)), position = "identity", alpha = 0.4) +
  labs(title = "Predicted Probabilities per am in the Test Dataset",
       x = "Predicted Probability of being in am 1", y = "Count") +
  scale_fill_manual(limits = c("1", "0"),
                    labels = c("Positive", "Negative"),
                    values = c("red", "blue")) +
  labs(fill = "am")
I just removed geom_density (and the color mapping, since you already have fill!).
This is what I get:
Histogram
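The density problem is probably caused by the data being unusual. For am = 0, 100% of the probability mass falls in the interval between 7.88492e-12 and 7.88493e-12. That said, your plot isn't even using your colors...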
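To make the point about the odd data concrete, here is a rough sketch in base R that looks at how narrow each group is and what a default kernel density estimate does with it. It assumes that geom_density falls back on the same default bandwidth rule as stats::density (bw.nrd0), which I believe is the case but have not checked against your ggplot2 version. The names by_group, ranges, bandwidths and peaks are just mine for illustration.

```r
# Same two columns as in the reproduction above
dt <- data.frame(
  am = c(1, 0, 0, 1, 1, 0, 1, 1, 1, 1),
  predictions = c(1, 7.884922e-12, 7.884924e-12, 1, 7.884886e-12,
                  7.884923e-12, 1, 7.884916e-12, 1, 7.884918e-12)
)

# Split the predictions into one vector per level of am ("0" and "1")
by_group <- split(dt$predictions, dt$am)

# Width of the interval that each group's predictions occupy
ranges <- sapply(by_group, function(x) diff(range(x)))

# Default bandwidth rule of stats::density, computed per group
# (assumption: geom_density uses the same rule by default)
bandwidths <- sapply(by_group, bw.nrd0)

# Highest point of the default kernel density estimate per group
peaks <- sapply(by_group, function(x) max(density(x)$y))

print(ranges)
print(bandwidths)
print(peaks)
```

If the assumption holds, the am = 0 group gets a bandwidth that is practically zero, so its density curve is an extremely tall, thin spike. The y axis then has to stretch to fit that spike, which would squash the histogram bars and the am = 1 curve down to nothing and leave only one group visible.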